Administration Guide

Restrictions on Result Sets to Improve Performance

A SELECT statement defines a set of rows which satisfy the search criteria. The DB2 optimizer assumes the application will retrieve all the qualifying rows. This assumption is most appropriate in OLTP and batch environments. However, in "browse" applications it is common for a query to define a very large potential answer set but only retrieve the first few rows, typically only as many rows as are required to fill the screen.

The optimizer's default assumption that all qualifying rows will be retrieved may not be the best choice for applications that do not update or delete the stored data.

There are four ways of modifying the SELECT statement to limit or modify the result table to improve performance. They are:

FOR UPDATE clause
FOR READ/FETCH ONLY clause
OPTIMIZE FOR n ROWS clause
FETCH FIRST n ROWS ONLY clause.

FOR UPDATE Clause

The FOR UPDATE clause identifies the columns that can be updated by a subsequent positioned UPDATE statement. If the FOR UPDATE clause is specified without column names, all updatable columns of the table or view are included. If column names are specified, each name must be unqualified and must identify a column of the table or view.

The FOR UPDATE clause cannot be used when either of the following is true:

The cursor associated with the SELECT statement is not deletable.
At least one of the selected columns is a non-updatable column of a catalog table and has not been excluded in the FOR UPDATE clause.

The DB2 CLI connection attribute SQL_ATTR_ACCESS_MODE can be used in CLI applications to achieve the same results. Refer to the SQLSetConnectAttr() section of the CLI Guide and Reference for more information.
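
As a minimal sketch of the FOR UPDATE clause in an embedded SQL application, the following statements declare a cursor that names SALARY as the only updatable column and then update the row on which the cursor is positioned. The cursor name C1, the host variables, the search condition, and the 5 percent increase are assumptions made for illustration only. In a host-language program, each statement would be prefixed with EXEC SQL.

     -- C1 and the host variables are hypothetical names
     DECLARE C1 CURSOR FOR
       SELECT EMPNO,LASTNAME,SALARY
       FROM EMPLOYEE
       WHERE SALARY < 30000
       FOR UPDATE OF SALARY

     OPEN C1

     FETCH C1 INTO :empno, :lastname, :salary

     -- Positioned UPDATE: changes only the row just fetched
     UPDATE EMPLOYEE
       SET SALARY = SALARY * 1.05
       WHERE CURRENT OF C1

     CLOSE C1

Because only SALARY is listed in the FOR UPDATE clause, a positioned UPDATE through C1 that tried to change LASTNAME would not be allowed.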

FOR READ or FETCH ONLY Clause

The FOR READ ONLY clause ensures that the result table is read-only. The FOR FETCH ONLY clause has the same meaning.

Some result tables are read-only by definition, for example, the result table of a SELECT on a view that is defined as read-only. You can still specify FOR READ ONLY in such a case, but the clause has no effect.

For result tables where updates and deletes are allowed, specifying FOR READ ONLY may improve the performance of FETCH operations. This possible improvement occurs when the database manager is able to retrieve the data in blocks and does not need to acquire exclusive locks on it. You should use the FOR READ ONLY clause to improve performance except in cases where the query is used in positioned UPDATE or DELETE statements.
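
For illustration, the following query would otherwise produce a result table on which updates and deletes are allowed, so adding FOR READ ONLY tells the database manager that no positioned UPDATE or DELETE will follow. The search condition is an assumption chosen only to keep the example small.

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     WHERE SALARY > 50000
     FOR READ ONLY

The example deliberately omits an ORDER BY clause. In DB2, a result table produced with ORDER BY is generally already read-only, in which case the clause would have no effect.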

OPTIMIZE FOR n ROWS Clause

The OPTIMIZE FOR clause provides a mechanism for an application to declare its intent to retrieve only a subset of the result or to give priority to the retrieval of the first few rows. Once this intent is understood, the optimizer can give preference to access plans that minimize the response time for retrieving the first few rows. Also, the number of rows that are sent to the client as a single block (see Row Blocking) is bounded by the value of "n" in the OPTIMIZE FOR clause. Therefore, the OPTIMIZE FOR clause affects both how the qualifying rows are retrieved from the database by the server, and how the qualifying rows are returned to the client.

For example, suppose you are querying the employee table for the employees with the highest salary on a regular basis.

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC

You have defined a descending index on the SALARY column. However, since employees are ordered by employee number, the salary index is likely to be very poorly clustered. The optimizer, in trying to avoid many random synchronous I/Os, would likely choose to use the list prefetch access method (see Understanding List Prefetching) which requires the row identifiers of all rows that qualify to be sorted. This can cause a delay before the first qualifying rows can be returned to the application. By adding the OPTIMIZE FOR clause to the statement as follows:

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC
     OPTIMIZE FOR 20 ROWS

the optimizer would likely choose to use the SALARY index directly with the knowledge that in all likelihood only the twenty employees with the highest salaries would be retrieved. Regardless of how many rows could be blocked, a block of rows is returned to the client every twenty rows.

Use of the OPTIMIZE FOR clause causes the optimizer to favor access plans that avoid bulk operations or operations that interrupt the flow of rows, such as sorts. You are most likely to influence an access path by using OPTIMIZE FOR 1 ROW. Using the OPTIMIZE FOR clause could have the following effects:

Join sequences with composite inners are less likely since they require a temporary table.
The join method could change. A nested loop join is the most likely choice, because it has low overhead cost and is usually more efficient if you only want to retrieve a few rows.
An index that matches the ORDER BY clause is more likely to be picked. This occurs because no sort would be needed for the ORDER BY.
List prefetch is less likely to be picked since this access method requires a sort.
Sequential prefetch is less likely to be requested by DB2 because it infers that you only want to see a small number of rows.
In a join query, the table with the columns in the ORDER BY clause is likely to be picked as the outer table if there is an index on that outer table that gives the ordering needed for the ORDER BY clause.

Although the OPTIMIZE FOR clause applies to all optimization classes (see Adjusting the Optimization Class), it works best for optimization class 3 and higher. The use of the greedy join enumeration method (see Search Strategies for Selecting Optimal Join) in optimization classes below 3 sometimes results in access plans for multi-table joins that do not lend themselves to quickly retrieving the first few rows.

The OPTIMIZE FOR clause does not prevent you from retrieving all the qualifying rows. However, the total elapsed time to retrieve all the qualifying rows may be significantly greater than if the optimizer had been allowed to optimize for the entire answer set.

If you have a packaged application that uses the call level interface (DB2 CLI or ODBC), you can have DB2 CLI automatically append an OPTIMIZE FOR clause to the end of each query statement by using the OPTIMIZEFORNROWS keyword in the db2cli.ini configuration file. For additional information, refer to the CLI Guide and Reference manual.

When selecting data from nicknames, results may vary depending on data source support. If the data source referenced by the nickname supports the OPTIMIZE FOR clause, and the DB2 optimizer pushes down the entire query containing the clause to the data source, then the clause is generated in the remote SQL sent to the data source. If the data source does not support this clause, or if the optimizer decides to execute the clause locally (least cost plan), the OPTIMIZE FOR clause is applied locally at DB2. In this case, the DB2 optimizer will continue to give preference to access plans that minimize the response time for retrieving the first few rows of a query, but the options available to the optimizer for generating plans are slightly limited and performance gains from the OPTIMIZE FOR clause may be negligible.

If both the FETCH FIRST clause and the OPTIMIZE FOR clause are specified, the lower of the two values is used to influence the communications buffer size. The two values are considered independent of each other for optimization purposes. See Using a SELECT-Statement for more information on the interaction between these two clauses.
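
As a sketch of the two clauses used together, the following statement limits the result table to 100 rows while asking the optimizer to favor fast retrieval of the first 20. The values 100 and 20 are arbitrary choices for illustration.

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC
     FETCH FIRST 100 ROWS ONLY
     OPTIMIZE FOR 20 ROWS

Under the rule described above, the lower value (20) would be the one that influences the communications buffer size.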

FETCH FIRST n ROWS ONLY Clause

The OPTIMIZE FOR n ROWS clause does not prevent the retrieval of all qualifying rows, although the total elapsed time to retrieve them may be significantly greater than if the optimizer had been allowed to optimize for the entire answer set. The FETCH FIRST n ROWS ONLY clause, by contrast, does limit the number of rows returned.

The FETCH FIRST n ROWS ONLY clause sets the maximum number of rows that can be retrieved by a SELECT statement. Limiting the result table to the first several rows can improve performance. At most n rows are retrieved, regardless of how many rows the result table would contain if the clause were not specified.
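
For example, the salary query used earlier in this section could be limited to the twenty highest-paid employees as follows. The value 20 is chosen only for illustration.

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC
     FETCH FIRST 20 ROWS ONLY

With this clause the application cannot retrieve a twenty-first row, no matter how many rows the EMPLOYEE table contains.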

DECLARE CURSOR WITH HOLD Statement

When you declare a cursor with the DECLARE CURSOR statement that includes the WITH HOLD clause, any open cursors remain open when the transaction is committed. Further, all locks are released, except locks protecting the current cursor position of open WITH HOLD cursors.

When you declare a cursor with the DECLARE CURSOR statement that includes the WITH HOLD clause, all open cursors are closed when the transaction ends with a ROLLBACK. Further, all locks are released and LOB locators are freed.
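
The following embedded SQL sketch illustrates the commit and rollback behavior described above. The cursor name C1 and the host variables are hypothetical. In a host-language program, each statement would be prefixed with EXEC SQL.

     -- C1 and the host variables are hypothetical names
     DECLARE C1 CURSOR WITH HOLD FOR
       SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
       FROM EMPLOYEE
       FOR READ ONLY

     OPEN C1

     FETCH C1 INTO :lastname, :firstname, :empno, :salary

     COMMIT
     -- C1 remains open after the COMMIT, so fetching can continue

     FETCH C1 INTO :lastname, :firstname, :empno, :salary

     ROLLBACK
     -- C1 is closed by the ROLLBACK and must be reopened before the next FETCH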

The DB2 CLI statement attribute SQL_ATTR_CURSOR_HOLD can be used in CLI applications to achieve the same results. For additional information, refer to the "SQLSetStmtAttr - Set Options Related to a Statement" section in the CLI Guide and Reference manual.

If you have a packaged application that uses the call level interface (DB2 CLI or ODBC), you can have DB2 CLI automatically assume the WITH HOLD clause for every declared cursor by using the CURSORHOLD keyword in the db2cli.ini configuration file. Refer to the transaction configuration keywords section of the CLI Guide and Reference for more information.
