1. Get rid of all advertisements and get unlimited access to documents by upgrading to Premium Membership. Upgrade to Premium Now and also get a Premium Badge!

SQL Aggregate functions

Discussion in 'SQL PL/SQL' started by SBH, Oct 16, 2010.

  1. SBH

    SBH Forum Expert

    Messages:
    99
    Likes Received:
    6
    Trophy Points:
    160
    Location:
    Bangalore
    Objective

    The article explains the Aggregate functions in SQL.

    1. Introduction

    Apart from the single row functions like CONCAT, SUBSTR, INSTR, NVL and others, Oracle provides Aggregate functions also. Unlike single Row SQL functions, Aggregate functions operate on GROUPS. They are not data manipulation functions but derive and produce additional information from the selected columns.
    Since they operate on GROUPS, GROUP BY clause is a mandatory clause in the SELECT statement for group formation. If they operate on whole result set, GROUP BY clause can be ignored.

    Syntax:
    Code (Text):
    GROUP_BY_FUNCTION ([DISTINCT | ALL] COLUMN_NAME)
    Illustrations

    For instance, maximum salary has to be found against each job id working in a department. This implies that first all departments must be grouped. Within each group, JOB IDs of that department must be grouped, and then aggregate function would operate on these subgroups to get the maximum salary.

    [​IMG]

    Query for the above figure looks like:

    Code (SQL):
    SELECT DEPARTMENT_ID,
           JOB_ID,
           MAX(SALARY)
    FROM EMPLOYEE
    GROUP BY DEPARTMENT_ID, JOB_ID
    2. Use of DISTINCT, ALL keywords with Aggregate functions

    DISTINCT ignores duplicate values. Aggregate function uses only unique values of the column.
    ALL is the default specification. Aggregate function includes all values of a not null columns .

    3. Types of Group Functions

    Below is the list of GROUP BY functions. Each function takes an argument and returns the single output for each row.

    1. COUNT – Counts the number of records.
    2. SUM – Sum of definite column value.
    3. AVG – Average of specified column value.
    4. MAX, MIN – To find maximum and minimum values of a column.
    5. STDDEV, VARIANCE – Calculates standard deviation and variance of a column value.

    Note that except COUNT(*), all other aggregate functions ignore NULL values.

    3.1. AVG

    AVG function calculates the average of the specified column value. It works with number data type only and returns numeric output. Using with DISTINCT identifier, it returns the average of only the unique column values. Syntax
    Code (Text):
    AVG (DISTINCT/ALL column name)
    Example a : Below query calculates the average salary of all the employees in the organization

    Code (SQL):


    SQL> SELECT
    AVG (SALARY)
    FROM EMPLOYEE;

    AVG (SAL
    --------
    5763.32

     
    Note that since result set is not grouped, AVG is calculated on full organization.


    3.2. SUM

    This function sums up the numeric column values of the column based group. It operates on number data ad returns numeric output.

    Using with DISTINCT identifier, it returns the sum of only the unique column values. Syntax
    SUM (DISTINCT/ALL column name)

    Example a: Below query calculates sum of salary for each job id.

    Code (SQL):

    SQL> SELECT
    JOB_ID,
    SUM (SALARY)
    FROM EMPLOYEE
    GROUP BY JOB_ID

    JOB_ID  SUM(SALARY)
    --------    -------------------
    MGR     12000
    DEV     23500
    HR      14000
    ADM     17000

     

    3.3. MAX, MIN

    Maximum and minimum functions are used to find the maximum and minimum values of the column.
    It takes one mandatory parameter, which is a column of any SQL supported data type i.e. number, character and date.

    Syntax:
    Code (Text):
    MAX(DISTINCT/ALL Column name)
    MIN(DISTINCT/ALL Column name)
     
    Example a: Below query lists the maximum salary drawn under each job category in the company

    Code (SQL):

    SELECT
    JOB_ID,
    MAX(SALARY)
    FROM EMPLOYEE
    GROUP BY JOB_ID

    JOB_ID  MAX(SALARY)
    --------    -------------------
    MGR     8320
    DEV     12300
    HR      7150
    ADM     8270

     
    Example b: Below query returns the oldest employee in each department.

    Code (SQL):
    SELECT
    DEPARTMENT_ID,
    MIN(HIRE_DATE)
    FROM EMPLOYEE
    GROUP BY DEPARTMENT_ID
    3.4. COUNT


    Count function counts the number of rows in the result set of SELECT query. It returns zero, if no records are found in the result set.

    Example a: Below query counts the employees in each department

    Code (SQL):

    SQL> SELECT
    DEPARTMENT_ID,
    COUNT (*)
    FROM EMPLOYEE
    GROUP BY DEPARTMENT_ID
    Example b: Below query counts the employees in whole organization, whose salary exceeds 5000

    Code (SQL):
    SQL> SELECT
    COUNT(EMPLOYEE_ID)
    FROM EMPLOYEE
    WHERE SAL > 5000

    4. Handling NULL values in GROUP BY functions

    GROUP BY functions work with actual values of the column only i.e. ignoring the NULL values (COUNT(*) is exception here). But this feature creates major difference in terms of calculations, end analytic results.

    But to avoid their ignoring, handling of NULL values in the columns is necessary because ignoring of NULL values might produce incorrect results. For instance, refer the below figure and queries

    [​IMG]

    Above figure shows the difference created in the Average salary of the employees due to NULL value. In the first screen shot, four salary for employees add up to 13785. But since average ignored NULL values column, it calculated average of only 3 employees as 4595, which is wrong. Next screen shot shows the handling of NULL value. It has bee replaced by zero using NULL handling function. Now, total salary gets distirbuted in 4 employees


    5. Nesting of Group functions

    Multiple group by functions can be used together to produce meaningful results. Suppose, you require maximum average across all the departments or the department with maximum employees, in such cases comes the application of nested group function.

    Closely observing the requirement, it says Department with Maximum Count of employee.


    Code (SQL):
    SQL>SELECT
    MAX(COUNT(EMPNO))
        FROM EMP
        GROUP BY DEPARTMENT_ID
    Notes:

    • Group by function must be specified in the query.
    • Multiple single-row functions can be nested together, but group functions can be nested up to two levels only. The nested function call COUNT(SUM(AVG( X))) returns the error, “ORA-00935: group function is nested too deeply.” However, using nested single-row functions within group functions works fine. Consider the following query:


    Code (SQL):
    SELECT
    SUM(AVG(LENGTH(LAST_NAME)))
    FROM EMPLOYEES
    GROUP BY DEPARTMENT_ID
    It calculates the sum of the average length of LAST_NAME values per department.


    References
     

    Attached Files: