GROUP BY identifies a selected column to use for grouping results. It divides the data into groups by the values in the column specified, and returns one row of results for each group. You can GROUP BY more than one column name (separate column names with commas). Always place GROUP BY after FROM and WHERE in a query, and before HAVING and ORDER BY.
All selected columns without an associated aggregation must appear in the GROUP BY clause.
GROUP BY accumulates the results by group, but does not necessarily order the groups; you need ORDER BY to do that. When you retrieve multiple rows from a table, the GROUP BY, HAVING, and ORDER BY clauses can be used to indicate:
The following query selects the average salary for each department.
This query:
SELECT DEPT, AVG(SALARY) FROM Q.STAFF GROUP BY DEPT
Produces this report:
DEPT AVG(SALARY) ------ ------------------ 10 20865.8625000000 15 15482.3325000000 20 16071.5250000000 38 15457.1100000000 42 14592.2625000000 51 17218.1600000000 66 17215.2400000000 84 16536.7500000000
In the above example, GROUP BY divides the table into groups of rows with the same department number, and returns one row of results for each group. DEPT can be selected without a built-in function because it is used with GROUP BY, and because every member of each group has the same DEPT. As stated above, all column names included in a SELECT clause must either have an associated built-in function or must appear in the GROUP BY clause. For example, if DEPT is not used in the GROUP BY clause (in the example above), the list of average salaries has little meaning.
This is correct:
SELECT DEPT, AVG(SALARY), JOB FROM Q.STAFF GROUP BY DEPT, JOB
This is incorrect:
SELECT DEPT, AVG(SALARY), JOB FROM Q.STAFF GROUP BY DEPT
Generally, GROUP BY produces one row of a report for each different value of the grouping column. When there are several columns named in the GROUP BY clause, a different group of rows is produced each time a value in one of the columns changes. However, if there are null values in the column, each null value is treated as a separate group, consisting of one member.
Using GROUP BY in SQL is an alternative to using the usage code GROUP on the form (as described in GROUP Usage Code). GROUP BY provides an extension to the grouping that can be specified on the form and it allows conditional selection of data, which cannot be done on the form. For example, to see the smallest, largest, and average of total department salaries:
SELECT DEPT, SUM(SALARY), SUM(SALARY), SUM(SALARY) FROM Q.STAFF GROUP BY DEPT
NUM COLUMN HEADING USAGE --- ----------------------------------- ------- 1 DEPT 2 SUM(SALARY) MINIMUM 3 SUM(SALARY)1 AVERAGE 4 SUM(SALARY)2 MAXIMUM
The report contains four columns, of which the last three are almost identical. All three show the total salary for each department; but the final row shows the minimum, average, and maximum of the totals.
Examples:
SELECT DEPT, JOB, MIN(SALARY), MAX(SALARY) FROM Q.STAFF WHERE JOB < >'MGR' GROUP BY DEPT, JOB
SELECT YEARS, COUNT(*), AVG(SALARY) FROM Q.STAFF GROUP BY YEARS
Remember that HAVING must be used with grouped data. When the HAVING statement and the GROUP BY statement are both used, the HAVING statement must follow the GROUP BY statement.
SELECT DEPT, MIN(SALARY), MAX(SALARY), AVG(SALARY) FROM Q.STAFF WHERE JOB < > 'MGR' GROUP BY DEPT HAVING AVG(SALARY) 12000
SELECT YEARS, COUNT(*), AVG(SALARY) FROM Q.STAFF GROUP BY YEARS HAVING COUNT(*) 2