The explanation tables produced by the EXPLAIN statement allow you to get information about the structure and execution performance of SQL statements. This information can help you analyze how existing database designs perform, or how future designs will perform. Specifically, you can use explanation tables to:
After you complete your design, and construct a prototype, you can use explanation tables to see how well real queries will work against the design. (You can select explain processing explicitly using the EXPLAIN statement, or implicitly using the EXPLAIN(YES) preprocessing parameter or the USING EXPLAIN(YES) option of the CREATE PACKAGE statement.)
Explain processing accepts an SQL statement as an argument, analyzes it, and inserts information about the structure and execution of that statement into the explanation tables, which you must create. (This includes the cost of internally generated statements.) You can then query the explanation tables.
You can select explain processing explicitly by using the EXPLAIN statement, or implicitly using the EXPLAIN(YES) preprocessor parameter or the EXPLAIN(YES) USING OPTION of the CREATE PACKAGE statement.
Note: | Explain processing does not execute the SQL statement. It only explains how the statement will work when you actually execute it. |
You can use the SQL EXPLAIN statement in an application program, the DBS utility, or ISQL to estimate execution performance. For the syntax of the EXPLAIN statement and the structure of the explanation tables produced, refer to the DB2 Server for VSE & VM SQL Reference manual.
To explicitly process the EXPLAIN statement, you must either:
Each time the EXPLAIN statement is executed, rows are appended to the specified explanation tables. Any existing rows are not affected.
When the EXPLAIN statement is issued for INSERT, UPDATE, and DELETE statements that change tables in a referential structure, information is returned not only on the INSERT, UPDATE, and DELETE statements, but also on internally generated statements. Refer to the DB2 Server for VSE & VM Application Programming manuals for more information on internally generated statements.
Each of the explanation tables has a column called QUERYNO (query number). The QUERYNO column has a data type of INTEGER. With the SET QUERYNO clause, you can place an integer value in the QUERYNO columns of the rows inserted by the EXPLAIN statement. Thus, you can use QUERYNO to identify new rows, and to mark them as corresponding to a particular statement.
For integer in SET QUERYNO, you must specify an integer constant that is not preceded by a sign. You cannot use a host variable in the SET QUERYNO clause, even in application programs. However, you can use the "&n" place-holder variables in ISQL.
The SET QUERYNO clause is optional. If you omit it, a NULL value is placed in the fields of the rows inserted by the EXPLAIN statement. If you set the QUERYNO to some initial value, this value identifies the query for which the EXPLAIN is issued. Because QUERYNO is an INTEGER field, an error is returned if its value exceeds 2 147 483 647.
You can enter the EXPLAIN statement from ISQL. When you enter EXPLAIN from ISQL, you must use a character constant if you execute the statement immediately. Alternatively, you can store the EXPLAIN statement by placing it in a routine or by using an ISQL STORE statement. This lets you use a place-holder (for example, &1) for explainable_sql_statement (refer to the DB2 Server for VSE & VM SQL Reference manual for the syntax of the EXPLAIN statement, including explainable_sql_statement). Thus, you can execute EXPLAIN for different SQL statements without having to key in the entire EXPLAIN statement each time. When using this technique, however, you should keep ISQL limits in mind. For example, when the place-holder is in a routine table, the length of input to a parameter is limited by the length of the COMMAND column of the routine table. At most, input to a place-holder can be 254 characters. This number is further reduced if you do not put the "&n" place-holder on a line by itself.
When you enter EXPLAIN from the DBS utility, you must use a character constant for explainable_sql_statement. The utility does not allow the use of host variables or place-holders in any SQL statement.
You can select explain processing implicitly using the EXPLAIN(YES) preprocessing parameter or the EXPLAIN(YES) using option of the CREATE PACKAGE statement. If you select explain processing implicitly, explanatory information is provided for all SQL statements in a package that can be explained and for all internally generated statements. The name of the package and the name of the owner of the package are stored in the explanation tables.
Because you cannot assign a QUERYNO when you select explain processing
implicitly, the section number assigned to the statement being preprocessed is
used as the query number. This number corresponds to the position of
the query in the application. Using the preprocessor listing file, you
can determine the section number assigned to a statement and use it to
determine the corresponding rows in the explanation tables. The
following variable names are used as section numbers for the languages
specified.
Table 9. Variable Names for Section Number (Query Number)
Language | Structure Name | Variable Name |
---|---|---|
ASM | RDIIN | RDISECT# |
COBOL | RDIIN | SQL-SECTION-NUM |
C | RDIIN | SECTION_NUM |
PLI | RDIIN | SECTION_NUM |
FORTRAN | SQLCTL | SQLSTMT |
For further information, see the DB2 Server for VSE & VM SQL Reference and DB2 Server for VSE & VM Application Programming manuals.
Implicit and explicit explain processing insert the same kind of information into the explanation tables during explain processing. The package name and package owner columns of the explanation tables, however, contain information only if implicit explain processing is used.
There is also a difference between implicit and explicit explain processing for the query number of an application. For explicit explain processing, if you do not supply the query number, it is set to NULL. For implicit explain processing, you cannot provide query numbers for SQL statements in the middle of an application, so the section number assigned to the statement when it is processed is used as the query number. You can then use the preprocessor listing file to determine the section number assigned to each statement and the corresponding rows in the explanation tables.
During explicit explain processing, rows are added to the explanation tables when a program is preprocessed, or dynamically repreprocessed. During implicit explain processing, rows are added when a program is preprocessed, but not when it is dynamically repreprocessed. In all situations, explicit explain processing overrides implicit explain processing.
When you processes an application program using the implicit EXPLAIN(YES) option, the preprocessor checks for the existence of the EXPLAIN tables once. If it does not find them, processing is terminated and SQLCODE -649 (SQLSTATE = 42704) is issued.
Each time the preprocessor encounters an explicit EXPLAIN statement, it checks for the existence of the explain tables. If its does not find them the explicit EXPLAIN is not processed and, SQLCODE -204 (SQLSTATE = 42704) or SQLWARNING +204 (SQLSTATE = 01532) is issued. This check is repeated for every explicit EXPLAIN statement that the preprocessor encounters.
There are four explanation tables: REFERENCE_TABLE, STRUCTURE_TABLE, COST_TABLE, and PLAN_TABLE. The definitions of these tables and the EXPLAIN statement syntax are in the DB2 Server for VSE & VM SQL Reference manual. A DBS utility job file, ARISEXP, to generate explanation tables, indexes, and views is shipped with the DB2 Server for VSE & VM product. Instructions for generating the tables using the ARISEXP file are provided at the top of the file. For further information on the contents of the explanation tables, refer to the DB2 Server for VSE & VM SQL Reference manual.
When you execute an EXPLAIN statement, information is placed in your tables, and is independent of any other user's explanation tables. You can review and summarize the information placed in your explanation tables just as you can other tables. However, because explanation tables only insert rows, you also have the responsibility to delete unnecessary information yourself.
In the following descriptions of each table, the term query block is used. A query block is a part of a query. Query blocks are used to distinguish the parts of a subquery. For example, when a query does not involve a subquery, there is only one query block: query block 1. When there is a subquery, there are two query blocks, the outer-level query and the subquery. They are referred to as query block 1 and query block 2, respectively. Because subqueries may be nested within each other, there may be many query blocks in a statement; each query block corresponds to separate (but interacting) parts of the statement.
The SELECT statement in Figure 17 is used in the following descriptions of the explanation tables. This SELECT statement has only one query block.
Figure 17. SELECT Statement for Explanation Table Descriptions
SELECT X.DEPTNAME, Y.FIRSTNME, Y.MIDINIT, Y.LASTNAME, Y.PHONENO FROM DEPARTMENT X, EMPLOYEE Y WHERE X.MGRNO = Y.EMPNO AND X.DEPTNO = Y.WORKDEPT |
Assume that user Smith owns tables DEPARTMENT and EMPLOYEE where:
This table is updated by EXPLAIN COST or EXPLAIN ALL. The information in this table provides the cost estimate of the statement for which the EXPLAIN is issued and for any internally generated statement used to enforce referential integrity. In addition, you can compute the contribution of each subquery (if any) to the total cost estimate of the statement. (To compute the subquery cost estimates, you will need to use information provided by the EXPLAIN STRUCTURE statement.)
For each query block in the statement, EXPLAIN inserts one row into COST_TABLE. The information depends on the existing indexes and catalog statistics. If indexes are added or dropped after you issue EXPLAIN for the statement, then the COST_TABLE entry for the command is not valid.
If you need a description of the columns in the COST_TABLE, refer to the DB2 Server for VSE & VM SQL Reference manual.
The value in COST is referred to as the cost estimate (occasionally referred to as the optimizer cost estimate or the resource cost estimate). All of these terms refer to the same thing: the internal value that the optimizer uses to represent the resource cost of executing an SQL statement for which the EXPLAIN is issued and for any internally generated statement used to enforce referential integrity. The value is a relative value that incorporates I/O requirements with a weighted factor of processor requirements for a query.
Aside from the COST column, there are two other ways to get this value for a given SQL statement. One way is to examine the SQLCA after preparing a dynamically defined SQL statement. The cost estimate is kept in the SQLERRD(4) field. A second way to see the cost estimate is using ISQL. ISQL displays the query cost estimate integer. The integer results from dividing the real internal value by 1000 and adding 1 to it. This produces a number that is easier to grasp. This is a valid technique because the numbers are relative to each other; they do not represent real physical consumption directly.
It follows, then, that it is futile to try to develop an algorithm that directly maps the cost estimate to a real physical unit such as time. Too many other factors are involved (for example, overall system workload). It is best to use the cost estimate as a general indicator.
For the SELECT statement in Figure 17, there would be only one row entered into COST_TABLE, because there is only one query block:
Figure 18. Results of COST_TABLE Query
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER COST ------- ---- -------- ------- -------- ------------------ 1 0 1 1.4388885498046E+01 TIMESTAMP -------------------------- 1999-08-26-09.49.25.601721 |
This table is updated by EXPLAIN PLAN or EXPLAIN ALL. The information in this table describes the order in which tables are accessed by the statement for which the EXPLAIN is issued and by any internally generated statement used to enforce referential integrity. In addition, the PLAN_TABLE table describes the indexes used to access the tables, and specifies whether indexes alone were used, the methods that the database manager used to do joins, the sorts done as part of runtime processing, and the reasons for the sorts.
As with the COST_TABLE, the PLAN_TABLE results depend on the existing indexes and catalog statistics at the time the EXPLAIN statement is executed. If indexes are added or deleted, then the PLAN_TABLE entry for the statement is not valid.
For each step in the plan determined by the database manager for processing the query, EXPLAIN inserts one row into the PLAN_TABLE. There is one step for each table reference in a query block. There are additional steps if the database manager must perform additional sorts at the end of processing for the query block, or if any internally generated statements are to be processed.
The steps in the plan are ordered by the value of the PLANNO column of PLAN_TABLE, and for each step, the TNAME column identifies the table accessed. The phrase "previous steps of the plan" refers to PLAN_TABLE rows with smaller values of PLANNO. The action described in a step is either a join of a table to those previously joined, or it is a sort. (Joins themselves may involve performing sorts.) The term "composite" refers to the result of all previous steps; the term "new" refers to the new table that is being accessed and joined as part of a particular plan step.
If you need a description of the columns in the PLAN_TABLE, refer to the DB2 Server for VSE & VM SQL Reference manual.
The PLAN_TABLE for the query in Figure 17 is shown below. Because there are many columns in PLAN_TABLE, the display of the table is split to fit on the page:
Figure 19. Results of PLAN_TABLE Query
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER PLANNO METHOD CREATOR ------- ---- -------- ------- -------- ------ ------ ------- 1 0 1 1 0 SMITH 1 0 1 2 1 SMITH TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ---------- ----- ---------- --------- ------------- DEPARTMENT 1 W 0 SMITH EMPLOYEE 2 I 0 SMITH ACCESSNAME INDEXONLY SORTNEW SORTCOMP SORTN_UNIQ SORTN_JOIN ---------- --------- ------- -------- ---------- ---------- MGRNOI N N N N N PKEYB1H9ZR8CD51W N N N N N SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY ------------- ------------- ---------- ---------- ------------- N N N N N N N N N N SORTC_GROUPBY TIMESTAMP REMARKS ------------- -------------------------- ------- N 1999-08-26-09.49.25.601721 N 1999-08-26-09.49.25.601721 |
A Type 1 (or nested loop) join is performed on tables EMPLOYEE and DEPARTMENT. The database manager accesses DEPARTMENT as the outer table of the join (the first table accessed), and EMPLOYEE as the inner table of the join.
The row with PLANNO=1 indicates that the database manager accesses DEPARTMENT using the index MGRNOI (which, as it happens, was created on the MGRNO column).
The entry with PLANNO=2 indicates that the database manager has performed an action on the EMPLOYEE table, based upon the conditions included in the query. An index has been generated internally on the primary key EMPNO of the EMPLOYEE table. This index performs the matching of EMPNO (in the EMPLOYEE table) to MGRNO (in the DEPARTMENT table). This index can be used because the value in MGRNO of DEPARTMENT, which must be matched by the EMPNO value of EMPLOYEE. Retrieval of rows from an inner table of a join will often, though not always, use an index on a join column of the inner table.
No sorts are used in the plan for this query. However, if the query had demanded SELECT DISTINCT, instead of SELECT, the plan would have an additional row, with PLANNO=3, which would have TABNO=0, METHOD=3, SORTC_UNIQ='Y' and SORTCOMP='U'.
This table is updated by EXPLAIN REFERENCE or EXPLAIN ALL. The database manager inserts one row in REFERENCE_TABLE for each column referenced in the statement (in certain ways, as explained below) and for any internally generated statement used to enforce referential integrity. Even if the column is referenced more than once for a table, there is still only one row inserted for the column and that row is for the most selective predicate. However, multiple appearances of a table in a query (as when a table is joined to itself) can lead to multiple descriptions of their columns.
One row is entered for each table reference, one for the statement as a whole, and one that indicates the way in which the column is used in the query. For a description of the columns in the REFERENCE_TABLE, refer to the DB2 Server for VSE & VM SQL Reference manual.
For the example statement presented in Figure 17, the new rows entered into REFERENCE_TABLE by EXPLAIN REFERENCE might be:
Figure 20. Results of the REFERENCE_TABLE Query
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER REFTYPE CREATOR TNAME
------- ---- -------- ------- -------- ------- ------- ----------
1 0 1 SELECT
1 0 1 TABLE SMITH DEPARTMENT
1 0 1 TABLE SMITH EMPLOYEE
1 0 1 COLUMN SMITH DEPARTMENT
1 0 1 COLUMN SMITH DEPARTMENT
1 0 1 COLUMN SMITH DEPARTMENT
1 0 1 COLUMN SMITH EMPLOYEE
1 0 1 COLUMN SMITH EMPLOYEE
1 0 1 COLUMN SMITH EMPLOYEE
1 0 1 COLUMN SMITH EMPLOYEE
1 0 1 COLUMN SMITH EMPLOYEE
1 0 1 COLUMN SMITH EMPLOYEE
TABNO CNAME COLNO FILTER DBSSPRED JOINPRED ORDERCOL
----- ----- ----- ------------------- -------- -------- --------
0 0 0.0E0 0
1 0 0.0E0 0
2 0 0.0E0 0
1 DEPTNAME 2 1.0E+00 N N 0
1 DEPTNO 1 1.111111448837E-01 Y Y 0
1 MGRNO 3 3.125E-02 Y Y 0
2 EMPNO 1 3.125E-02 Y Y 0
2 FIRSTNME 2 1.0E+00 N N 0
2 LASTNAME 4 1.0E+00 N N 0
2 MIDINIT 3 1.0E+00 N N 0
2 PHONENO 6 1.0E+00 N N 0
2 WORKDEPT 5 1.1111110448837E-01 Y Y 0
GROUPCOL UPDATECOL TIMESTAMP
-------- --------- -------------------------
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
0 1999-08-26-09.49.25.601721
These rows indicate that the statement is a SELECT statement with no subqueries, joining two tables, SMITH.DEPARTMENT and SMITH.EMPLOYEE. The columns MGRNO, DEPTNO from DEPARTMENT, and the columns EMPNO, WORKDEPT from EMPLOYEE appear together in the 'WHERE' clause (identified by a Y in the DBSSPRED column), permitting indexes to be used. These columns are the JOIN columns (identified by a Y in the JOINPRED column). FILTER may be misleading, because the filtering depends on the order in which tables are processed.
|
This table is updated by EXPLAIN STRUCTURE or EXPLAIN ALL. The database manager inserts one row in STRUCTURE_TABLE for each query block in the statement.
If you need a description of the columns in the STRUCTURE_TABLE, refer to the DB2 Server for VSE & VM SQL Reference manual.
If the following SELECT statement is issued, only one row is entered in STRUCTURE_TABLE, as shown in Figure 21, because there is only one query block.
EXPLAIN ALL FOR SELECT * FROM EMPLOYEE
Figure 21. Results of the STRUCTURE_TABLE Query
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER ROWCOUNT TIMES
------- ---- -------- ------- -------- -------- ---------
0 1 32 0.0E0
PARENT ATOPEN TIMESTAMP
------ ------ --------------------------
0 N 1999-08-26-09.49.26.001720
A more complicated example is provided in the following sections, where we show how to separate the costs for individual query blocks using STRUCTURE_TABLE and COST_TABLE together.
A query may have subqueries, which in turn may have subqueries. The database manager separates this tree of subqueries into pieces, called query blocks. Each query block has its own tables, columns, and rowcount. EXPLAIN STRUCTURE, COST lets you look at combined information, or to separately examine the information for each query block.
The PARENT field gives the logical parent for each query block, which is not always obvious from the query itself. Sometimes, a query block has no correlation to the query where it immediately appears, so it is executed only once, when some ancestor query block is first entered, rather than many times.
The following example has a large number of query blocks, but references only one table (many times). It illustrates the meanings of PARENT and ATOPEN, as well as the method for decomposing COST values into separate costs for query blocks.
QBLOCKNO --------- SELECT * FROM DEPT X *** 1 *** WHERE DNAME > ALL (SELECT DNAME FROM DEPT *** 2 *** WHERE X.DNO = DNO AND LOC = 32) AND DNO = (SELECT DNO FROM DEPT Y *** 3 *** WHERE MGR = (SELECT MGR FROM DEPT Z *** 4 *** WHERE DNAME IN (SELECT DNAME FROM DEPT *** 5 *** WHERE NEMP = X.NEMP) AND DNO = (SELECT DNO FROM DEPT W *** 6 *** WHERE NEMP > Z.NEMP AND LOC IN (SELECT LOC FROM DEPT *** 7 *** WHERE DNAME = Y.DNAME)) AND LOC = 32 ) AND Y.NEMP < (SELECT AVG(NEMP) FROM DEPT *** 8 *** WHERE Y.MGR = MGR )); |
Here are results from STRUCTURE_TABLE for this query. (ROWCOUNT is not shown.)
Figure 22. Example STRUCTURE_TABLE
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER TIMES PARENT ------- ---- -------- ------- -------- ------ ------ ? 0 1 9.000 0 ? 0 2 0.500 1 ? 0 3 9.000 1 ? 0 4 0.500 3 ? 0 5 2.000 3 ? 0 6 1.500 4 ? 0 7 1.000 4 ? 0 8 1.000 3 ATOPEN TIMESTAMP ------ -------------------------- N 1999-08-26-09.49.27.011719 N 1999-08-26-09.49.27.011719 N 1999-08-26-09.49.27.011719 N 1999-08-26-09.49.27.011719 Y 1999-08-26-09.49.27.011719 N 1999-08-26-09.49.27.011719 Y 1999-08-26-09.49.27.011719 N 1999-08-26-09.49.27.011719 |
Note: | A ? indicates a NULL value. |
Although query block 5 is physically nested within query block 4, it references neither Y nor Z. Hence, its value can be computed once each time query block 3 is first entered, with a particular X row.
Query block 5 is executed once, at open time of query block 3.
Query block 7 is physically within query block 6, but can be evaluated once when the database manager first enters query block 4. Query block 7 does not need to be re-executed again until new values for rows outside query block 4 are required. This is different from query block 2, for example, which must be executed once for each row of query block 1, rather than just once when execution of query block 1 begins.
TIMES may be a fraction (and may be less than 1) because it represents the estimated number of times, per execution of a query block, that its dependent query blocks will be executed.
The most important use of this query block breakdown involves computation of costs for individual query blocks, instead of costs for the query as a whole. This is described next.
Here are the COST_TABLE entries for the query in the preceding section:
Figure 23. Results of COST_TABLE Query
QUERYNO RINO QBLOCKNO PKGNAME PKGOWNER COST ------- ---- -------- ------- -------- --------- ? 0 1 3021.000 ? 0 2 10.000 ? 0 3 324.000 ? 0 4 24.000 ? 0 5 4.000 ? 0 6 6.000 ? 0 7 10.000 ? 0 8 10.000 TIMESTAMP -------------------------- 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 1999-08-26-09.49.27.011719 |
The cost displayed is the total cost for each query block, including costs associated with all query blocks that are below it in the logical tree of query blocks. Thus, the cost of executing the statement is approximately 3021. (For simplicity in the calculations that follow, the values are shown as integers, but they need not be.)
The cost for executing the entire statement helps you understand the effect of the statement on system load, but it hides the blocks of the query that are contributing the most to the cost of the query.
A more useful set of figures might be those listed in the following table:
Note: | The following table is only an example. It is not stored in the
database, and INDIVCOST and MULTCOST are not columns in COST_TABLE.
|
INDIVCOST represents the cost of one execution of the individual query block itself, not including the costs of any of its subqueries. MULTCOST not only counts the cost of the individual query block, but multiplies INDIVCOST by the number of times the query block is expected to be executed in the query. This is a better measure of the cost importance of the query block than either COST or INDIVCOST. Notice that the MULTCOST column adds up to COST(1), the total cost of the entire query.
Thus, the following formula can be used to derive INDIVCOST:
COST(I) = INDIVCOST(I) + the sum, over all blocks J that have I as logical parent, of either: TIMES(I) * COST(J) if J is not done AT OPEN of I, or COST(J) if J is done AT OPEN of I. |
For example:
COST(3) = 324 = INDIVCOST(3) + 9*24 + 4 + 9*10,
so INDIVCOST(3) is 14.
Here is another example:
COST(4) = 24 = INDIVCOST(4) + 0.5*6 + 10,
so INDIVCOST(4) is 11.
This formula can be used to derive MULTCOST from INDIVCOST:
MULTCOST(I) = INDIVCOST(I) * the product of TIMES(I) for all logical ancestors of I, if I is not done AT OPEN of its parent, or INDIVCOST(I) * the product of TIMES(I) for all logical ancestors of I EXCEPT its parent, if I is done AT OPEN of its parent. |
For example:
MULTCOST(7) = 10 * TIMES(3) * TIMES(1) = 810.
We do not multiply TIMES(4) into that product, because 4 is 7's immediate parent, and 7 is done AT OPEN of 4.
This demonstrates that the most important component of the estimated cost comes from block 4, and you might choose indexes that make processing this query block cheaper. By inspecting the query, or from the rows in REFERENCE_TABLE, you might decide that indexes on one of DNAME, DNO or LOC might reduce the cost of processing.
Because ROWCOUNT (estimated number of rows in result, for queries, or of affected rows, for updates and deletes) is stored in STRUCTURE_TABLE, it is easy to gauge the estimated size of responses. If your structure includes a DELETE CASCADE rule, EXPLAIN will include the cost of the cascading effects of a DELETE. This can help you understand whether requests are reasonable, and whether the statistics in the database catalog tables that estimate ROWCOUNT seem up-to-date.
ROWCOUNT can also help determine space requirements when responses are being stored in program data structures. However, ROWCOUNT, like all estimates made by the system, is neither precisely accurate, nor even an upper bound on the actual number of rows in the response.
A systematic analysis of many statements in the workload of the system can help the administrator plan the access paths for the database. The analysis should consider costs when different combinations of indexes exist. It should also consider the costs of performing updates, which are not reflected in the COST column of COST_TABLE, and limits on space for indexes.
You may load the database with tables ordered on certain columns. Indexes on such columns (called CLUSTERING indexes) enable the database manager to maintain this ordering. Because the ordering of rows within tables strongly affects the costs of execution, it may be worthwhile to reload the database to improve performance. For each statement, join, ORDER BY, and GROUP BY columns may be good candidates for ordering. Also, tables that are often joined might be interleaved on join columns when the database is loaded.
If the cost of executing a statement (as determined by EXPLAIN COST, or by running the statement) is higher than expected, a user or administrator may want to look at the procedure that the database manager chose to execute that statement. Building additional access paths or altering the layout of tables may be necessary to achieve good performance for the statement. For example, if a relation scan, that is ACCESS TYPE='R', is performed on a large table, it may be better to build an index on some column of that table; EXPLAIN REFERENCE provides hints about which indexes might help. Adding new indexes makes updates more expensive, so this decision must be considered carefully.
The PLAN_TABLE can also help the administrator determine which indexes are not being used, so that he or she may decide which indexes might be dropped. This assumes that the administrator knows not only the significant statements in programs, but also the significant statements issued by users directly at their terminals.