>>-table-identifier--+-----------------------+------------------> | .-,-----------. | | V | | '-(----column-name-+--)-' >--AS--(--fullselect--+-----------------+--+--------------------+--)--> '-order-by-clause-' '-fetch-first-clause-' >--+---------------+--+--------------+------------------------->< '-search-clause-' '-cycle-clause-' search-clause: .-,-----------. V | |--SEARCH--+-DEPTH FIRST---+--BY----column-name-+--SET--seq-column-name--| '-BREADTH FIRST-' cycle-clause: .-,-----------. V | |--CYCLE----column-name-+---------------------------------------> >--SET--cycle-column-name--TO--constant--DEFAULT--constant------> >--+--------------------------+---------------------------------| '-USING--using-column-name-'
A common-table-expression permits defining a result table with a table-identifier that can be specified as a table name in any FROM clause of the fullselect that follows. The table-identifier must be unqualified. Multiple common table expressions can be specified following the single WITH keyword. Each common table expression specified can also be referenced by name in the FROM clause of subsequent common table expressions.
If a list of columns is specified, it must consist of as many names as there are columns in the result table of the fullselect. Each column-name must be unique and unqualified. If these column names are not specified, the names are derived from the select list of the subselect used to define the common table expression.
The table-identifier of a common table expression must be different from any other common table expression table-identifier in the same statement. A common table expression table-identifier can be specified as a table name in any FROM clause throughout the fullselect. A table-identifier of a common table expression overrides any existing table, view, or alias (in the catalog) with the same unqualified name.
If more than one common table expression is defined in the same statement, cyclic references between the common table expressions are not permitted. A cyclic reference occurs when two common table expressions dt1 and dt2 are created such that dt1 refers to dt2 and dt2 refers to dt1.
The table name of a common table expression can only be referenced in the select-statement, INSERT statement, or CREATE VIEW statement that defines it.
If a select-statement, INSERT statement, or CREATE VIEW statement refers to an unqualified table name, the following rules are applied to determine which table is actually being referenced:
A common-table-expression can be used:
If a fullselect of a common table expression contains a reference to itself in a FROM clause, the common table expression is a recursive table expression. Queries using recursion are useful in supporting applications such as bill of materials (BOM), reservation systems, and network planning.
The following restrictions apply to a recursive common-table-expression:
If a column name of the common-table-expression is referred to in the iterative fullselect, the attributes of the result columns are determined using the rules for result columns. For more information see Rules for result data types.
The column-name must identify a column name of the recursive common-table-expression. The column-name must not be qualified.
The seq-column-name may only be referenced in the ORDER BY clause of the outer fullselect that references the common-table-expression. The seq-column-name cannot be referenced in the fullselect that defines the common-table-expression.
The seq-column-name must not be the same as using-column-name or cycle-column-name.
Each column-name must identify a result column of the common table expression. The same column-name must not be specified more than once.
When cyclic data in the row is encountered, the duplicate row is not returned to the recursive query process for further recursion and that child branch of the query is stopped. By specifying the provided cycle-column-name is in the result set of the main fullselect, the existence of cyclic data can actually be determined and even corrected if that is desired.
The cycle-column-name must not be the same as using-column-name or seq-column-name.
The cycle-column-name can be referenced in the fullselect that defines the common-table-expression.
The using-column-name must not be the same as cycle-column-name or seq-column-name.
Recursive common table expressions are not allowed if the query specifies:
Bill of materials (BOM) applications are a common requirement in many business environments. To illustrate the capability of a recursive common table expression for BOM applications, consider a table of parts with associated subparts and the quantity of subparts required by the part. For this example, create the table as follows:
CREATE TABLE PARTLIST ( PART VARCHAR(8), SUBPART VARCHAR(8), QUANTITY INTEGER )
To give query results for this example, assume that the PARTLIST table is populated with the following values:
PART SUBPART QUANTITY -------- -------- ----------- 00 01 5 00 05 3 01 02 2 01 03 3 01 04 4 01 06 3 02 05 7 02 06 6 03 07 6 04 08 10 04 09 11 05 10 10 05 11 10 06 12 10 06 13 10 07 14 8 07 12 8
The first example is called single level explosion. It answers the question, "What parts are needed to build the part identified by '01'?". The list will include the direct subparts, subparts of the subparts and so on. However, if a part is used multiple times, its subparts are only listed once.
WITH RPL (PART, SUBPART, QUANTITY) AS ( SELECT ROOT.PART, ROOT.SUBPART, ROOT.QUANTITY FROM PARTLIST ROOT WHERE ROOT.PART = '01' UNION ALL SELECT CHILD.PART, CHILD.SUBPART, CHILD.QUANTITY FROM RPL PARENT, PARTLIST CHILD WHERE PARENT.SUBPART = CHILD.PART ) SELECT DISTINCT PART, SUBPART, QUANTITY FROM RPL ORDER BY PART, SUBPART, QUANTITY
The above query includes a common table expression, identified by the name RPL, that expresses the recursive part of this query. It illustrates the basic elements of a recursive common table expression.
The first operand (fullselect) of the UNION, referred to as the initialization fullselect, gets the direct children of part '01'. The FROM clause of this fullselect refers to the source table and will never refer to itself (RPL in this case). The result of this first fullselect goes into the common table expression RPL (Recursive PARTLIST). As in this example, the UNION must always be a UNION ALL.
The second operand (fullselect) of the UNION uses RPL to compute subparts of subparts by having the FROM clause refer to the common table expression RPL and the source table with a join of a part from the source table (child) to a subpart of the current result contained in RPL (parent). The result goes back to RPL again. The second operand of UNION is then used repeatedly until no more children exist.
The SELECT DISTINCT in the main fullselect of this query ensures the same part/subpart is not listed more than once.
The result of the query is as follows:
PART SUBPART QUANTITY -------- -------- ----------- 01 02 2 01 03 3 01 04 4 01 06 3 02 05 7 02 06 6 03 07 6 04 08 10 04 09 11 05 10 10 05 11 10 06 12 10 06 13 10 07 12 8 07 14 8
Observe in the result that from part '01' we go to '02' which goes to '06' and so on. Further, notice that part '06' is reached twice, once through '01' directly and another time through '02'. In the output, however, its subcomponents are listed only once (this is the result of using a SELECT DISTINCT) as required.
The second example is a summarized explosion. The question posed here is, what is the total quantity of each part required to build part '01'. The main difference from the single level explosion is the need to aggregate the quantities. The first example indicates the quantity of subparts required for the part whenever it is required. It does not indicate how many of the subparts are needed to build part '01'.
WITH RPL (PART, SUBPART, QUANTITY) AS ( SELECT ROOT.PART, ROOT.SUBPART, ROOT.QUANTITY FROM PARTLIST ROOT WHERE ROOT.PART = '01' UNION ALL SELECT PARENT.PART, CHILD.SUBPART, PARENT.QUANTITY*CHILD.QUANTITY FROM RPL PARENT, PARTLIST CHILD WHERE PARENT.SUBPART = CHILD.PART ) SELECT PART, SUBPART, SUM(QUANTITY) AS "Total QTY Used" FROM RPL GROUP BY PART, SUBPART ORDER BY PART, SUBPART
In the above query, the select list of the second operand of the UNION in the recursive common table expression, identified by the name RPL, shows the aggregation of the quantity. To find out how much of a subpart is used, the quantity of the parent is multiplied by the quantity per parent of a child. If a part is used multiple times in different places, it requires another final aggregation. This is done by the grouping over the common table expression RPL and using the SUM aggregate function in the select list of the main fullselect.
The result of the query is as follows:
PART SUBPART Total Qty Used -------- -------- -------------- 01 02 2 01 03 3 01 04 4 01 05 14 01 06 15 01 07 18 01 08 40 01 09 44 01 10 140 01 11 140 01 12 294 01 13 150 01 14 144
Looking at the output, consider the line for subpart '06'. The total quantity used value of 15 is derived from a quantity of 3 directly for part '01' and a quantity of 6 for part '02' which is needed 2 times by part '01'.
The question may come to mind, what happens when there are more levels of parts in the table than you are interested in for your query? That is, how is a query written to answer the question, "What are the first two levels of parts needed to build the part identified by '01'?" For the sake of clarity in the example, the level is included in the result.
WITH RPL ( LEVEL, PART, SUBPART, QUANTITY) AS ( SELECT 1, ROOT.PART, ROOT.SUBPART, ROOT.QUANTITY FROM PARTLIST ROOT WHERE ROOT.PART = '01' UNION ALL SELECT PARENT.LEVEL+1, CHILD.PART, CHILD.SUBPART, CHILD.QUANTITY FROM RPL PARENT, PARTLIST CHILD WHERE PARENT.SUBPART = CHILD.PART AND PARENT.LEVEL < 2 ) SELECT PART, LEVEL, SUBPART, QUANTITY FROM RPL
This query is similar to example 1. The column LEVEL was introduced to count the levels from the original part. In the initialization fullselect, the value for the LEVEL column is initialized to 1. In the subsequent fullselect, the level from the parent is incremented by 1. Then to control the number of levels in the result, the second fullselect includes the condition that the parent level must be less than 2. This ensures that the second fullselect only processes children to the second level.
The result of the query is:
PART LEVEL SUBPART QUANTITY -------- ----------- -------- ----------- 01 1 02 2 01 1 03 3 01 1 04 4 01 1 06 3 02 2 05 7 02 2 06 6 03 2 07 6 04 2 08 10 04 2 09 11 06 2 12 10 06 2 13 10
(C) Copyright IBM Corporation 1992, 2006. All Rights Reserved.