OLAP specifications

OLAP-specification:
|--+-ranking-specification---+----------------------------------|
'-numbering-specification-'
ranking-specification:
|--+-RANK-------+--(--)--OVER--(--+-------------------------+--window-order-clause--)--|
'-DENSE_RANK-' '-window-partition-clause-'
numbering-specification:
|--ROW_NUMBER--(--)--OVER--(--+-------------------------+--+---------------------+--)--|
'-window-partition-clause-' '-window-order-clause-'
window-partition-clause:
.-,-----------------------.
V |
|--PARTITION BY----partitioning-expression-+--------------------|
window-order-clause:
|--ORDER BY----------------------------------------------------->
.-,--------------------------------------------------.
| .-NULLS LAST-. |
V .-ASC--+------------+---. |
>----+-sort-key-expression--+-----------------------+-+-+-------|
| +-ASC NULLS FIRST-------+ |
| | .-NULLS FIRST-. | |
| +-DESC--+-------------+-+ |
| '-DESC NULLS LAST-------' |
'-ORDER OF--table-designator---------------------'
On-Line Analytical Processing (OLAP) specifications
provide the ability to return ranking, row numbering, and existing aggregate
function information as a scalar value in a query result. An OLAP specification
can be included in an expression in a select-list or the ORDER BY
clause of a select-statement. The query result to which the OLAP
specification is applied is the result table of the innermost subselect that
includes the OLAP specification.
An OLAP specification is not valid in a WHERE, VALUES, GROUP
BY, HAVING, or SET clause, and an OLAP specification is not valid in the JOIN
ON join-condition. An OLAP specification cannot be used as an argument
of an aggregate function.
When invoking an OLAP specification, a window is specified that
defines the rows over which the function is applied, and in what order. When
used with an aggregate function, the applicable rows can be further refined,
relative to the current row, as either a range or a number of rows preceding
and following the current row. For example, within a partition by month, an
average can be calculated over the previous three month period.
The data type of the result of RANK, DENSE_RANK, or ROW_NUMBER is BIGINT.
The result cannot be null.
- RANK or DENSE_RANK
- Specifies that the ordinal rank of a row within the window is to be
computed. Rows that are not distinct with respect to the ordering within
their window are assigned the same rank. The results of ranking may be defined
with or without gaps in the numbers resulting from duplicate values.
- RANK
- Specifies that the rank of a row is defined as 1 plus the number of
rows that strictly precede the row. Thus, if two or more rows are not distinct
with respect to the ordering, then there will be one or more gaps in the sequential
rank numbering.
- DENSE_RANK
- Specifies that the rank of a row is defined as 1 plus the number of
preceding rows that are distinct with respect to the ordering. Therefore,
there will be no gaps in the sequential rank numbering.
- ROW_NUMBER
- Specifies that a sequential row number is to be computed for the row
within the window defined by the ordering, starting with 1 for the first row.
If the ORDER BY clause is not specified in the window, the row numbers are
assigned to the rows in arbitrary order, as returned by the subselect (not
according to any ORDER BY clause in the (select-statement).
- PARTITION BY (partitioning-expression,...)
- Defines the partition within which the function is applied. A partitioning-expression is an expression used in defining the partitioning of the result set.
Each column name referenced in a partitioning-expression must unambiguously
reference a result set column of the OLAP specification subselect statement.
A partitioning-expression cannot include a scalar-fullselect or any function that is not deterministic or has an external action.
- ORDER BY (sort-key-expression,...)
- Defines the ordering of rows within a partition that determines the
value of the OLAP specification (it does not define the ordering of the query
result set).
- sort-key-expression
- An expression used in defining the ordering of the rows within a window
partition. Each column name referenced in a sort-key-expression must
unambiguously reference a column of the result set of the subselect, including
the OLAP specification. A sort-key-expression cannot include a scalar-fullselect or any function that is not deterministic or that has
an external action. This clause is required for the RANK and DENSE_RANK functions.
- ASC
- Uses the values of the sort-key-expression in ascending order.
- DESC
- Uses the values of the sort-key-expression in descending order.
- NULLS FIRST
- The window ordering considers null values before all non-null values
in the sort order.
- NULLS LAST
- The window ordering considers null values after all non-null values
in the sort order.
- ORDER OF table-designator
- Specifies that the same ordering used in table-designator should
be applied to the result table of the subselect. There must be a table reference
matching table-designator in the FROM clause of the subselect that
specifies this clause and the table reference must identify a nested-table-expression or common-table-expression. The subselect (or fullselect) corresponding
to the specified table-designator must include an ORDER BY clause
that is dependent on the data. The ordering that is applied is the same as
if the columns of the ORDER BY clause in the nested subselect (or fullselect)
were included in the outer subselect (or fullselect), and these columns were
specified in place of the ORDER OF clause.
An OLAP specification is not allowed if the query specifies:
- lateral correlation,
- a sort sequence,
- an operation that requires CCSID conversion,
- a LOWER, TRANSLATE, or UPPER scalar function,
- a UTF-8 or UTF-16 argument in a CHARACTER_LENGTH, POSITION, or SUBSTRING
scalar function,
- a table function,
- a distributed table,
- a table with a read trigger,
- a table referenced directly or indirectly in the fullselect
must not be a DDS-created logical file, or
- a logical file built over multiple physical file members.
Notes
Syntax alternatives: DENSERANK can be specified in
place of DENSE_RANK, and ROWNUMBER can be specified in place of ROW_NUMBER.
Examples
- Display the ranking of employees, in order by surname, according to their
total salary (based on salary plus bonus) that have a total salary more than
$30,000:
SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
RANK() OVER (ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
FROM EMPLOYEE
WHERE SALARY+BONUS > 30000
ORDER BY LASTNAME
Note that
if the result is to be ordered by the ranking, then replace ORDER BY LASTNAME
with:
ORDER BY RANK_SALARY
or:
ORDER BY RANK() OVER (ORDER BY SALARY+BONUS DESC)
- Rank the departments according to their average total salary:
SELECT WORKDEPT, AVG(SALARY+BONUS) AS AVG_TOTAL_SALARY,
RANK() OVER (ORDER BY AVG( SALARY+BONUS) DESC) AS RANK_AVG_SAL
FROM EMPLOYEE
GROUP BY WORKDEPT
ORDER BY RANK_AVG_SAL
- Rank the employees within a department according to their education level.
Having multiple employees with the same rank in the department should not
increase the next ranking value:
SELECT WORKDEPT, EMPNO, LASTNAME, FIRSTNME, EDLEVEL,
DENSE_RANK() OVER (PARTITION BY WORKDEPT ORDER BY EDLEVEL DESC) AS RANK_EDLEVEL
FROM EMPLOYEE
ORDER BY WORKDEPT, LASTNAME
- Provide row numbers in the result of a query:
SELECT ROW_NUMBER() OVER (ORDER BY WORKDEPT, LASTNAME ) AS NUMBER,
LASTNAME, SALARY
FROM EMPLOYEE
ORDER BY WORKDEPT, LASTNAME
- List the top five wage earners:
SELECT EMPNO, LASTNAME, FIRSTNME, TOTAL_SALARY, RANK_SALARY
FROM (SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
RANK() OVER (ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
FROM EMPLOYEE) AS RANKED_EMPLOYEE
WHERE RANK_SALARY < 6
ORDER BY RANK_SALARY
Note
that a nested table expression was used to first compute the result, including
the rankings, before the rank could be used in the WHERE clause. A common
table expression could also have been used.
(C) Copyright IBM Corporation 1992, 2006. All Rights Reserved.