DB2 Server for VSE & VM: SQL Reference


Chapter 5. Queries

A query specifies a result table.

In a program, a query is a component of certain SQL statements. There are three forms of a query:

There is another form of select, described under SELECT INTO.


Authorization

For any form of a query, the privileges held by the authorization ID of the statement must include at least one of the following for each of the tables or views identified in the statement:


subselect



>>-select_clause-from_clause-+--------------+--+-----------------+--+---------------+-->
                             '-where_clause-'  '-group_by_clause-'  '-having_clause-'
 
>--------------------------------------------------------------><
 

The subselect is a component of the fullselect, the CREATE VIEW statement, and the INSERT statement. It is also a component of certain predicates which, in turn, are components of a subselect. A subselect that is a component of a predicate is called a subquery. If more than one table or view is identified in the FROM clause, the subselect is called a join. (See the DB2 Server for VSE & VM Application Programming manual for information on joining tables.)

A subselect specifies a result table derived from the tables or views identified in the FROM clause. The derivation can be described as a sequence of operations in which the result of each operation is input for the next. (This is only a way of describing the subselect. The method used to perform the derivation may be quite different from this description.)

The sequence of the (hypothetical) operations is:

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause.

select-clause



             .-ALL------.
>>-SELECT----+----------+---+-*----------------------------+---><
             '-DISTINCT-'   |  .-,-----------------------. |
                            |  V                         | |
                            '----+-expression---------+--+-'
                                 +-table_name.*-------+
                                 +-view_name.*--------+
                                 '-correlation_name.*-'
 

The SELECT clause specifies the columns of the final result table. The column values are produced by the application of the select list to R. The select list is the names or expressions specified in the SELECT clause. R is the result of the previous operation of the subselect. For example, if the only clauses specified are SELECT, FROM, and WHERE, R is the result of that WHERE clause.

ALL
Retains all rows of the final result table and does not eliminate redundant duplicates. This is the default.

DISTINCT
Eliminates all but one of each set of duplicate rows of the final result table. DISTINCT must not be used more than once in a subselect. This restriction includes SELECT DISTINCT and the use of DISTINCT in a column function of the select list or HAVING clause, but does not include subqueries of the subselect.

Two rows are duplicates of one another only if each value in the first row is equal to the corresponding value in the second row. (For determining duplicate rows, two null values are considered equal.)

Select List Notation

Represents a list of names that identify the columns of table R. The first name in the list identifies the first column of R, the second name identifies the second column of R, and so on.

The list of names is established when the statement containing the SELECT clause is prepared. Hence, * does not identify any columns that have been added to a table after the statement has been prepared.

expression
Can be any expression of the type described in Chapter 3, "Language Elements". Each column name used in the select list must unambiguously identify a column of R. The operand of an operator must not be a column function that includes the keyword DISTINCT.

name.*
Represents a list of names that identify the columns of name. The name can be a table name, view name, or correlation name, and must designate a table or view named in the FROM clause. The first name in the list identifies the first column of the table or view, the second name in the list identifies the second column of the table or view, and so on.

The list of names is established when the statement containing the SELECT clause is prepared. Hence, * does not identify any columns that have been added to a table after the statement has been prepared.

The number of columns in the result of SELECT is the same as the number of expressions in the operational form of the select list (that is, the list established during program preparation) and must not exceed 255. The result table of a subquery must be a single column, unless the subquery is used in the EXISTS predicate.

Limitation on Long String Columns

No column in the list may be a long string column if:

Note:

The restriction does not apply to a subquery of an EXISTS predicate because an EXISTS subquery does not return values.

Applying the Select List

Some of the results of applying the select list to R depend on whether GROUP BY or HAVING is used. Those results are described separately.

If GROUP BY or HAVING is used:

If neither GROUP BY nor HAVING is used

In either case the nth column of the result contains the values specified by applying the nth expression in the operational form of the select list.

Null attribute of result columns

Result columns allow null values if they are derived from:

Names of result columns

A result column derived from a column name acquires the unqualified name of that column. All other result columns have no names.

Data type of result columns

Each column of the result of SELECT acquires a data type from the expression from which it is derived.
When the expression is ... The data type of the result column is ...
the name of any numeric column the same as the data type of the column, with the same precision and scale for decimal columns.
an integer constant INTEGER.
a decimal or floating-point constant the same as the data type of the constant, with the same precision and scale for decimal constants. For floating-point constants, the data type is DOUBLE PRECISION.
the name of any numeric host variable the same as the data type of the variable, with the same precision and scale for decimal variables. If the data type of the variable is not identical to an SQL data type (for example, DISPLAY SIGN LEADING SEPARATE in COBOL), the result column is decimal.
an arithmetic or string expression the same as the data type of the result. Decimal results have the same precision and scale as described under Expressions.
any function (see Chapter 4, "Functions" to determine the data type of the result.)
the name of any string column the same as the data type of the column, with the same length attribute.
the name of any string host variable the same as the data type of the variable, with a length attribute equal to the length of the variable. If the data type of the variable is not identical to an SQL data type (for example, a NUL-terminated string in C), the result column is a varying-length string.
a character string constant of length n VARCHAR(n)
a graphic string constant of length n VARGRAPHIC(n)
the name of a datetime column the same as the data type of the column.


from-clause



           .-,----------------------------------------.
           V                                          |
>>-FROM--------+-table_name-+--+------------------+---+--------><
               '-view_name--'  '-correlation_name-'
 

The FROM clause specifies an intermediate result table. If a single table or view is identified, the intermediate result table is simply that table or view. If more than one table or view is identified, the intermediate result table consists of all possible combinations of the rows of the identified tables or views. Each row of the result is a row from the first table or view concatenated with a row from the second table or view, concatenated in turn with a row from the third, and so on. The number of rows in the result is the product of the number of rows in all the named tables or views.

The following rules apply to the names specified in a FROM clause:

If a correlation_name is specified for a table or view, any qualified reference to a column of that table or view in the subselect must use that correlation_name. For more information about the FROM clause, see Correlation Names.


where-clause



>>-WHERE--search_condition-------------------------------------><
 

The WHERE clause specifies an intermediate result table that consists of those rows of R for which the search_condition is true. R is the result of the FROM clause of the subselect.

The search_condition must conform to the following rules:

Any subquery in the search_condition is effectively processed for each row of R and the results are used in the application of the search_condition to the given row of R. A subquery is actually processed for each row of R only if it includes a correlated reference. In fact, a subquery with no correlated references is processed just once, whereas a subquery with a correlated reference may have to be processed once for each row.


group-by-clause



               .-,--------------.
               V                |
>>-GROUP BY-------column_name---+------------------------------><
 

The GROUP BY clause specifies an intermediate result table that consists of a grouping of the rows of R. R is the result of the previous clause of the subselect.

Each column_name must unambiguously identify a column of R other than a long string column. Each identified column is called a grouping column.

The result of GROUP BY is a set of groups of rows. The rows within each group are in an arbitrary order. In each group of more than one row, all values of each grouping column are equal; and all rows with the same set of values of the grouping columns are in the same group. For grouping, all null values within a grouping column are considered equal.

Because every row of a group contains the same value of any grouping column, the name of a grouping column can be used in a search condition in a HAVING clause or an expression in a SELECT clause; in each case, the reference specifies only one value for each group.

If the grouping column contains varying-length strings with trailing blanks, the values in the group can differ in the number of trailing blanks and may not all have the same length. In that case, a reference to the grouping column still specifies only one value for each group, but the value for a group is chosen arbitrarily from the available set of values. Thus, the actual length of the result value is unpredictable.

If a field procedure is not involved, the collating sequence depends on the CCSID of the application server. With the DRDA protocol, the application server could be using an ASCII CCSID, producing an unexpected result to an application program assuming an EBCDIC CCSID and code page (see the IBM SQL Reference manual for details).

GROUP BY is ignored if used in a subquery of a basic predicate.


having-clause



>>-HAVING--search_condition------------------------------------><
 

The HAVING clause specifies an intermediate result table that consists of those groups of R for which the search_condition is true. R is the result of the previous clause of the subselect. If this clause is not GROUP BY, R is considered a single group with no grouping columns.

Each column name in the search_condition must:

A group of R to which the search condition is applied supplies the argument for each column function in the search condition, except for any function whose argument is a correlated reference.

If the search condition contains a subquery, the subquery can be thought of as being proceesed each time the search condition is applied to a group of R, and the results used in applying the search condition. In actuality, the subquery is processed for each group only if it contains a correlated reference. For an illustration of the difference, see examples 6 and 7 under Examples of a subselect.

A correlated reference to a group of R must either identify a grouping column or be contained within a column function.

The HAVING clause must not be used in a subquery of a basic predicate.


Examples of a subselect

Example 1

Select all columns and rows from the EMPLOYEE table.

  SELECT * FROM EMPLOYEE

Example 2

Join the EMP_ACT and EMPLOYEE tables, select all the columns from the EMP_ACT table and add the employee's surname (LASTNAME) from the EMPLOYEE table to each row of the result.

SELECT EMP_ACT.*, LASTNAME
  FROM EMP_ACT, EMPLOYEE
  WHERE EMP_ACT.EMPNO = EMPLOYEE.EMPNO

Example 3

Using a join-condition on the EMPLOYEE and DEPARTMENT tables, select the employee number (EMPNO), employee name (FIRSTNME concatenated with MIDINIT concatenated with LASTNAME), department number (WORKDEPT in the EMPLOYEE table and DEPTNO in the DEPARTMENT table) and department name (DEPTNAME) of all employees who were born (BIRTHDATE) earlier than 1930.

  SELECT EMPNO, FIRSTNME CONCAT ' ' CONCAT MIDINIT CONCAT ' ' CONCAT LASTNAME,
    WORKDEPT, DEPTNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE WORKDEPT = DEPTNO
    AND YEAR(BIRTHDATE) < 1930

Example 4

Select the job (JOB) and the minimum and maximum salaries (SALARY) for each group of rows with the same job code in the EMPLOYEE table, but only for groups with more than one row and with a maximum salary greater than or equal to 27 000.

  SELECT JOB, MIN(SALARY), MAX(SALARY)
    FROM EMPLOYEE
    GROUP BY JOB
    HAVING COUNT(*) > 1 AND MAX(SALARY) >= 27000

Example 5

Select all the rows of EMP_ACT table for employees (EMPNO) in department (WORKDEPT) 'E11'. (Employee department numbers are shown in the EMPLOYEE table.)

  SELECT * FROM EMP_ACT
    WHERE EMPNO IN (SELECT EMPNO FROM EMPLOYEE
                             WHERE WORKDEPT = 'E11')

Example 6

From the EMPLOYEE table, select the department number (WORKDEPT) and maximum departmental salary (SALARY) for all departments whose maximum salary is less than the average salary for all employees.

  SELECT WORKDEPT, MAX(SALARY)
    FROM EMPLOYEE
    GROUP BY WORKDEPT
    HAVING MAX(SALARY) < (SELECT AVG(SALARY)
                                   FROM EMPLOYEE)

The subquery in the HAVING clause would only be processed once in this example.

Example 7

Using the EMPLOYEE table, select the department number (WORKDEPT) and maximum departmental salary (SALARY) for all departments whose maximum salary is less than the average salary in all other departments.

  SELECT WORKDEPT, MAX(SALARY)
    FROM EMPLOYEE EMP_COR
    GROUP BY WORKDEPT
    HAVING MAX(SALARY) < (SELECT AVG(SALARY)
                                   FROM EMPLOYEE
                                   WHERE NOT WORKDEPT = EMP_COR.WORKDEPT)
Note:In contrast to example 6, the subquery in the HAVING clause would need to be executed for each group.

fullselect



>>-+-subselect----+--------------------------------------------->
   '-(fullselect)-'
 
      .------------------------------------------.
      V                                          |
>--------+------------------------------------+--+-------------><
         '--+-UNION-----+---+-subselect----+--'
            '-UNION ALL-'   '-(fullselect)-'
 

A fullselect specifies a result table. If UNION is not used, the result of the fullselect is the result of the specified subselect.

UNION  or  UNION ALL
Derives a result table by combining two other result tables (R1 and R2). If UNION ALL is specified, the result consists of all rows in R1 and R2. If UNION is specified without the ALL option, the result is the set of all rows in either R1 or R2, with duplicate rows eliminated. In either case, each row of the UNION table is either a row from R1 or a row from R2. The columns of the result are not named in the SQLDA.

Two rows are duplicates if each value in the first is equal to the corresponding value of the second. (For determining duplicates, two null values are considered equal.)

UNION and UNION ALL are associative operations. However, when UNION and UNION ALL are used in the same statement, the result depends on the order in which the operations are performed. Operations within parentheses are performed first. When the order is not specified by parentheses, operations are performed in left-to-right order.

Rules for columns

R1 and R2 must have the same number of columns, and the data type of the nth column of R1 must be compatible with the data type of the nth column of R2.
 
R1 and R2 must not include long string columns.
 
The nth column of the result of UNION and UNION ALL is derived from the nth columns of R1 and R2. The following table shows all valid combinations of operand columns and, for each combination, the data type of the result column.

If one operand column is... And the other operand is... The data type of the result column is...
CHAR(x) CHAR(y) CHAR(z) where z = max(x,y)
VARCHAR(x) CHAR(y) or VARCHAR(y) VARCHAR(z) where z = max(x,y)
bit data mixed, SBCS, or bit data bit data
mixed data mixed or SBCS data mixed data
SBCS data SBCS data SBCS data
GRAPHIC(x) GRAPHIC(y) GRAPHIC(z) where z = max(x,y)
VARGRAPHIC(x) GRAPHIC(y) or VARGRAPHIC(y) VARGRAPHIC(z) where z = max(x,y)
DATE DATE DATE
TIME TIME TIME
TIMESTAMP TIMESTAMP TIMESTAMP
FLOAT (double) any numeric type FLOAT (double)
FLOAT (single) FLOAT (single) FLOAT (single)
FLOAT (single) DECIMAL, NUMERIC, INTEGER, or SMALLINT FLOAT (double)
DECIMAL(w,x) DECIMAL(y,z) or NUMERIC(y,z,) DECIMAL(p,s) where p = max(x,z)+max(w-x,y-z) s = max(x,z) (see note below)
DECIMAL(w,x) INTEGER DECIMAL(p,x) where p = x+max(w-x,11) (see note below)
DECIMAL(w,x) SMALLINT DECIMAL(p,x) where p = x+max(w-x,5) (see note below)
INTEGER INTEGER INTEGER
INTEGER SMALLINT INTEGER
SMALLINT SMALLINT SMALLINT
Note:A decimal result column must not have a precision greater than 31.
If neither operand column allows nulls, the result column does not allow nulls. Otherwise, the result column allows nulls. If the description of any operand column is not the same as the description of the result column, its values are converted to conform to the description of the result column.
The conversion operation is exactly the same as if the values were assigned to the result column. For example, if one operand column is CHAR(10), and the other operand column is CHAR(5), the result column is CHAR(10), and the values derived from the CHAR(5) column are padded on the right with five blanks.

Examples of a fullselect

Example 1

Select all columns and rows from the EMPLOYEE table.

  SELECT * FROM EMPLOYEE

Example 2

List the employee numbers (EMPNO) of all employees in the EMPLOYEE table whose department number (WORKDEPT) either begins with 'E' or who are assigned to projects in the EMP_ACT table whose project number (PROJNO) begins with either 'MA2100', 'MA2110', or 'MA2112'.

  SELECT EMPNO FROM EMPLOYEE
    WHERE WORKDEPT LIKE 'E%'
  UNION
  SELECT EMPNO FROM EMP_ACT
    WHERE PROJNO IN('MA2100', 'MA2110', 'MA2112')

Example 3

Make the same query as in example 2, and, in addition, "tag" the rows from the EMPLOYEE table with 'emp' and the rows from the EMP_ACT table with 'emp_act'.

  SELECT EMPNO, 'emp' FROM EMPLOYEE
    WHERE WORKDEPT LIKE 'E%'
  UNION
  SELECT EMPNO, 'emp_act' FROM EMP_ACT
    WHERE PROJNO IN('MA2100', 'MA2110', 'MA2112')

Example 4

Make the same query as in example 2, only use UNION ALL so that no duplicate rows are eliminated.

  SELECT EMPNO FROM EMPLOYEE
    WHERE WORKDEPT LIKE 'E%'
  UNION ALL
  SELECT EMPNO FROM EMP_ACT
    WHERE PROJNO IN('MA2100', 'MA2110', 'MA2112')

Conversion Rules for Operations that Combine Strings

The operations that combine strings are concatenation, UNION, and UNION ALL. These rules also apply to the VALUE scalar function. In each case, the CCSID of the result is determined at bind time, and the execution of the operation may involve conversion of strings to the coded character set identified by that CCSID.

The CCSID of the result is determined by the CCSIDs of the operands. The CCSIDs of the first two operands determine an intermediate result CCSID, this CCSID and the CCSID of the next operand determine a new intermediate result CCSID, and so on. The last intermediate result CCSID and the CCSID of the last operand determine the CCSID of the result string or column. For each pair of CCSIDs, the result CCSID is determined by the sequential application of the following rules:

An operand of concatenation or the selected argument of the VALUE scalar function is converted, if necessary, to the coded character set of the result string. Each string of an operand of UNION or UNION ALL is converted, if necessary, to the coded character set of the result column. Character conversion is necessary only if all of the following are true:

An error occurs if a character of a string cannot be converted or if the CCSID Conversion Selection Table is used but does not contain any information about the CCSID pair. A warning occurs if a character of a string is converted to the substitution character.

Examples

Example 1

Given the following:
Expression Type CCSID
COL_1 column 00001
HV_2 host variable 00002
COL_3 column 00003

When evaluating the predicate:

  COL_1 CONCAT :HV_2 CONCAT COL_3

The resulting CCSID of the first two operands is 00001. Because the result of the first concatenation is a derived string, the second concatenation will have a result CCSID of 00003 (the column CCSID is chosen over the CCSID of the derived string). The final CCSID is 00003.

Example 2

Using the information from the previous example, when evaluating the predicate:

  VALUE(COL_1, :HV_2, COL_3)

The resulting CCSID of the first two operands is 00001. However, the expression type is column, not derived string, since the two operands are not combined as in the concatenation example. Using the rules for the intermediate CCSID and the third operand's CCSID, 00001 (the intermediate CCSID) is chosen as the final CCSID. This is because the CCSID of the first column is chosen over the CCSID of the second column.


select-statement



>>-fullselect----+------------------+---+-------------+--------><
                 +-order_by_clause--+   '-with_clause-'
                 '-update_clause----'
 

The select-statement is the form of a query that can be directly specified in a DECLARE CURSOR statement, or prepared and then referenced in a DECLARE CURSOR statement. It can also be issued interactively, causing a result table to be displayed at your terminal. In either case, the table specified by a select-statement is the result of the fullselect.

order-by-clause



               .-,-----------------------------.
               V                     .-ASC--.  |
>>-ORDER BY--------+-column_name-+---+------+--+---------------><
                   '-integer-----'   '-DESC-'
 

The ORDER BY clause specifies an ordering of the rows of the result table. If a single column is identified, the rows are ordered by the values of that column. If more than one column is identified, the rows are ordered by the values of the first identified column, then by the values of the second identified column, and so on. A long string column must not be identified.

A named column may be identified by an integer or a column_name. An unnamed column must be identified by an integer. A column is unnamed if it is derived from a constant, an arithmetic expression, or a function. If the fullselect includes a UNION operator, every column of the result table is unnamed.

column_name
Must unambiguously identify a column of the result table. Although columns not included in the result table cannot be referenced in the ORDER BY clause, the rules for unambiguous column references are the same as in the other clauses of the fullselect. See Column Name Qualifiers to Avoid Ambiguity for more information

integer
Must be greater than 0 and not greater than the number of columns in the result table. The integer n identifies the nth column of the result table.

ASC
Uses the values of the column in ascending order. This is the default.

DESC
Uses the values of the column in descending order.

Ordering is performed in accordance with the comparison rules described in Chapter 3, "Language Elements". The null value is higher than all other values. If your ordering specification does not determine a complete ordering, rows with duplicate values of the last identified column have an arbitrary order. If the ORDER BY clause is not specified, the rows of the result table have an arbitrary order.

If a field procedure is not involved, the collating sequence depends on the CCSID of the application server. With the DRDA protocol, the application server could be using an ASCII CCSID, producing an unexpected result to an application program assuming an EBCDIC CCSID and code page (see the IBM SQL Reference for details).

update-clause



                  .-,--------------.
                  V                |
>>-FOR UPDATE OF-----column_name---+---------------------------><
 

The UPDATE clause identifies the columns that can be updated in a subsequent Positioned UPDATE statement. Each column_name must be unqualified and must identify a column of the table or view identified in the first FROM clause of the fullselect. The clause must not be specified if the result table of the fullselect is read-only.

If an UPDATE clause is specified, only the columns identified in that clause can be updated in subsequent Positioned UPDATE statements.

If a dynamically prepared select-statement does not include an UPDATE clause, its associated cursor is not updateable.

The use of this clause for statically prepared select-statements and statically prepared Positioned UPDATE statements depends on whether the NOFOR preprocessor option is in effect. If the UPDATE clause is not specified:

See the DB2 Server for VSE & VM Application Programming manual for information on preprocessing and running programs.

If blocking is not in effect, a row may be deleted from a non-read-only table if the FOR UPDATE OF clause is specified.

If blocking is in effect and you intend to perform a positioned delete operation, blocking must be explicitly turned off by specifying the FOR UPDATE OF clause in the DECLARE statement.

Notes:

  1. There is no corresponding FOR DELETE OF clause.

  2. The order-by clause is not allowed with the update-clause.

with-clause



>>-WITH----+-RR-+----------------------------------------------><
           +-CS-+
           '-UR-'
 

The WITH clause specifies the isolation level at which the statement is executed.

RR
Repeatable read

CS
Cursor stability

UR
Uncommitted read

WITH UR can be specified only if the result table is read-only.

The isolation level specified on the SELECT statement will override any other isolation level specification; for example, in ISQL, if SET ISOLATION CS has been specified, and a SELECT statement WITH UR is executed, that statement will use an isolation level of uncommitted read. A SELECT statement without the WITH clause will use an isolation level of CS, as defined by the SET ISOLATION statement. As another example, a statement specifying WITH UR in a package prepped with ISOL(CS) will use an isolation level of uncommitted read.

If a SELECT statement specifying WITH UR is used with a cursor that is not read-only, SQLCODE -173 will be returned indicating that WITH UR cannot be specified on a select statement used in a non-read-only cursor.

Examples of a select-statement

Example 1

Select all columns and rows from the EMPLOYEE table.

  SELECT * FROM EMPLOYEE

Example 2

Select the project name (PROJNAME), start date (PRSTDATE), and end date (PRENDATE) from the PROJECT table. Order the result table by the end date with the most recent dates appearing first.

  SELECT PROJNAME, PRSTDATE, PRENDATE
    FROM PROJECT
    ORDER BY PRENDATE DESC

Example 3

Select the department number (WORKDEPT) and average departmental salary (SALARY) for all departments in the EMPLOYEE table. Arrange the result table in ascending order by average departmental salary.

  SELECT WORKDEPT, AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY WORKDEPT
    ORDER BY 2

Example 4

Declare a cursor named UP_CUR to be used in a PL/I program to update the start date (PRSTDATE) and the end date (PRENDATE) columns in the PROJECT table. The program must receive both of these values together with the project number (PROJNO) value for each row.

  EXEC SQL  DECLARE UP_CUR CURSOR FOR
              SELECT PROJNO, PRSTDATE, PRENDATE
                FROM PROJECT
                FOR UPDATE OF PRSTDATE, PRENDATE;


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]