DB2 Server for VSE & VM: Performance Tuning Handbook

DASD Storage

How you manage DASD storage affects performance in four ways:

How Storage Is Divided: How you divide a limited amount of storage between indexes and data, and among dbspaces and among storage pools determines to a large degree how each will perform in different situations.
Wasted Storage: Wasted storage in itself may not affect the performance of the system that is using it, but it may represent a resource that could be used to improve performance elsewhere.
Distributing DASD I/O: How well you balance the demand for DASD I/O across several DASD devices, controllers and channels can affect how fast the database manager can retrieve information from DASD.
Running out of Storage: While running out of storage can disrupt your users because you are forced to bring down the application server to add storage, just getting close can degrade performance. (If you reach the application server's short on storage level you trigger unnecessary SOSLEVEL checkpoints, refer to Short on Storage Cushion.)

The directory, logs, and dbextents are VSAM Entry Sequenced Data Sets (ESDS) with a control interval size of 512 bytes for the directory and 4096 bytes for the logs and dbextents. The database manager uses VSAM Control Interval processing to read and write records to the VSAM ESDS.

In VM

The directory, logs, and dbextents are CMS reserved minidisks with a blocksize of 512 bytes for the directory and 4096 bytes for the logs and dbextents (the directory may have a blocksize of 4096 bytes, if Data Spaces Support is used). These minidisks have CMS-like files that are in a format to be used with the IUCV *BLOCKIO I/O system that reads and writes records to these files. These minidisks are called reserved because they have been processed by the CMS RESERVE command. It specifies that the minidisk consists of a single CMS file, which is allocated using all available disk blocks. This CMS file cannot be processed by most CMS file system commands and must never be modified, except by the database manager.

Mapping of Dbspaces to DASD

Logical dbspaces must be mapped to physical dbextents on DASD. The database manager does this by maintaining page map table(s), for each dbspace, which map a given dbspace page to its location on DASD. The page map table is stored in the DB2 Server for VSE & VM directory. There can be multiple page map tables per dbspace. Each page map table block is equivalent to 128 pages in a dbspace.

Logical To Physical Page Relationships

Physical page slots in the storage pool are allocated to the dbspaces dynamically upon first reference. Once a logical page has had a physical page slot allocated to it, it will continue to have a physical page allocated, even if empty, until the dbspace is dropped.

Storage Pools

A storage pool is a collection of one or more dbextents, which can be used to control the distribution of the database across DASDs. The maximum number of storage pools for a given database is specified by the database generation keyword MAXPOOLS. A storage pool does not exist until a dbextent is assigned to it. Dbspaces are assigned to a given storage pool when they are defined. That means when physical page slots are allocated to the dbspace, they are allocated from the storage pool to which the dbspace belongs.

In addition, if the storage pool contains more than one dbextent, the database manager allocates pages in a storage pool in sequence, usually allocating all the pages in one dbextent before using the next dbextent. With the DB2 Server DSS Feature, the database manager can distribute pages evenly across all the extents, refer to Striping.

Managing Storage Pool Space

Short on Storage Cushion

The short on storage (SOS) cushion helps you avoid completely filling a storage pool. If the database manager is running:

In SUM with LOGMODE not equal to N or
In MUM

and the percentage of space available in one pool falls below the SOS level, the database manager performs a checkpoint to release shadow pages (refer to Shadow Pages). If this does not release enough pages to fall below the SOS level, a warning message is sent to the operator. If you are already short on storage and need more storage in a pool, refer to Running out of Dbspace Pages.

Tuning Parameter (SOSLEVEL)

While it is acceptable to reach the SOSLEVEL initialization parameter occasionally, do not let any of your storage pools hover around it. SOSLEVEL initiated checkpoints are unnecessary overhead. If they occur frequently, it is a good sign that you should either free space in the overloaded pools, or increase their size by adding dbextents.

Do not just lower the SOSLEVEL to avoid checkpoints. If you have less than 10% free space in a storage pool the database manager will initiate a checkpoint during a rollback even if you set SOSLEVEL below 10%.

Instead, set SOSLEVEL to at least 15% and try to keep at least 25% free space in each storage pool. This ensures that even if you accumulate a large number of shadow pages in a pool, the database manager will not initiate unnecessary checkpoints.

Performance Indicator (SHOW POOL)

Use the SHOW POOL operator command to determine what percentage of each storage pool is full. If the free space in a pool falls below the SOSLEVEL parameter, the SHORT ON STORAGE flag appears in the report for that pool. (Refer to page ***.)

Also watch the CHKPOINT counter. If you notice an excessive number of checkpoints occurring during insert or update transactions, the database manager may be doing the following:

Reaching the SOSLEVEL and performing a checkpoint.
The checkpoint releases just enough shadow pages for the pool to fall below the SOSLEVEL.
Subsequent processing quickly refills the pool to the SOSLEVEL and another checkpoint is taken.

The database manager may spend so much time processing SOSLEVEL checkpoints that it can perform little useful work. Changing the SOSLEVEL will not help this problem. Instead, add storage to the pool, refer to Running out of Dbspace Pages.

Types of Pages

There are four types of pages that can reside in a dbspace:

Header Pages: These pages contain an inventory of all the dbspace attributes, tables and indexes created in the dbspace.
Data Pages: These pages contain table rows that may be from several different tables in the dbspace.
Index Pages: These pages contain index entries. Each page contains information for one specific index on one specific table.
Shadow Pages: These pages are used to ensure that the database manager can reconstruct changes to the database after a system failure, refer to Shadow Pages.

Number of Header Pages

Because there are never many header pages in a single dbspace, never more than eight, they do not represent a significant impact on performance. We suggest that they remain at the system default of eight.

Proportion of Index to Data and Header Pages

The amount of space you reserve for index pages, depends on how many indexes you expect to create and on the number and size of columns included in the indexes. You can use the following as a guideline:

Read-Only Data: Since many indexes are recommended for read-only data, you should reserve at least the default of 33% and as much as 50% for the pages in a dbspace for index pages.
Update Intensive Data: You can reserve less than the default of 33% of the pages in a dbspace for index pages, since you may not use as many indexes for this type of data. (It is expensive to update indexes every time data is updated, so it is suggested that you use fewer indexes with this data.)

If you are unsure whether your data is read-only or update intensive, use the default of 33% index pages.

Tuning Parameter (PCTINDEX)

You can set the proportion of index to data and header pages in a dbspace when you acquire it using the PCTINDEX parameter. For example, the following statement acquires a dbspace and reserves 50% of its pages as index pages:

   ACQUIRE PUBLIC dbspace NAMED test_dbspace (PCTINDEX=50)

Performance Indicator (PCTINDX)

To determine the current percentage of reserved index pages in a dbspace, look in the PCTINDX column of the SYSTEM.SYSDBSPACES catalog table for your dbspace.

Free Space in Data Pages

You can reserve a percentage of each data page for updates that make the changed row longer than it was before. This free space is not used for inserts. You can reclaim the free space for inserts through an ALTER DBSPACE statement. The percentage of free space you choose will depend on the type of activity being carried out on the data in the dbspace:

High Insert/Low Update Activity: This is the situation where there will be few updates, or all columns are fixed length and non-nullable in the tables. Here, you would set the percentage of free space to a high value before loading the data; then lower it to a low value. The difference between the original value and the final value can then be used by insert activity.
Low Insert/High Update Activity: In this situation, PCTFREE should be set to a low to medium value, depending on the likelihood of updated rows increasing in length. (Increase PCTFREE in proportion to the likelihood of increasing row length.) The space saved by PCTFREE will be used by the update activity only if the update increases the size of the row and the free space will accommodate the new row.
Low Insert/Low Update Activity Or Read-Only Data: Read-only data is data that is loaded into a dbspace and then never modified or updated, only retrieved using query statements. In this situation, set PCTFREE to a low value or zero, before you load any data into the dbspace.
High Insert/High Update Activity: In this situation, set PCTFREE to a high value while you load data into the database and then lower it. This would allow space for use by both update and insert activities.

One purpose of PCTFREE is to minimize overflow because of row expansion. When UPDATE commands are executed on an existing row and the length of the row increases, the row could expand into the free space reserved with PCTFREE. If the expansion exceeds the free space, the page becomes full, and it causes an overflow. The row is relocated to a new page and a pointer chaining to the new location is set in the old page. If the row has to be moved again, the pointer is set to mark the newest location. Therefore, the database manager never reads more than two pages for one row.

The other purpose of PCTFREE is to reserve space on a page when data is loaded. After loading, PCTFREE can be lowered to allow the free space to be used for inserts (to help keep the data clustered).

Tuning Parameter (PCTFREE)

You can set the percentage of space on each page that is kept free when data is inserted in the dbspace, when you acquire it using the PCTFREE parameter. For example, the following statement acquires a dbspace and reserves 20% of the space on each page for inserts:

   ACQUIRE PUBLIC dbspace NAMED test_dbspace (PCTFREE=20)

You need to know the PCTFREE setting for a dbspace before you can calculate the number of rows you can effectively store on a single data page. For a complete description of how to calculate this, refer to "Estimating the Number of Data Pages Required" in the DB2 Server for VSE & VM Database Administration manual.

Performance Indicator (FREEPCT)

To determine the current percentage of space on each page that is kept free when data is inserted in the dbspace, look in the FREEPCT column of the SYSTEM.SYSDBSPACES catalog table for your dbspace.

Performance Indicator (AVGROWLEN)

To determine average length of the rows in a table, look in the AVGROWLEN column of the SYSTEM.SYSCATALOG catalog table.

Performance Indicator (NOVERFLOW)

To determine how many rows are overflowing onto new pages, look in the NOVERFLOW column of the SYSTEM.SYSCATALOG catalog table. As a rule of thumb, if the number of overflow rows in a table (NOVERFLOW) exceeds 5% of the total number of rows in the table (ROWCOUNT), it is probably time to reorganize the table. Refer to Reorganizing Data.

If you decide to reorganize the table because of this, you may also want to alter the dbspace to give it a larger PCTFREE value.

Free Space in Index Pages

You can reserve a percentage of space in each index page for future index entries, which allows index maintenance to take place without splitting of index pages. Its default is 10 percent, which is a good value for most purposes. If you expect much insert or update activity after the creation of the index, you might want to override the default by setting the percentage to a higher value. If you expect no insert or update activity after the creation of the index, you might want to override the default by setting the percentage to zero. Usually, a low value (5% to 10%) is a good choice when creating an index, as this allows enough room to accommodate a low level of maintenance.

Tuning Parameter (PCTFREE)

You can set the percentage of space in each index page for future index entries, when you create it using the PCTFREE parameter. For example, the following statement creates an index and reserves 20% of the space on each page for future entries:

   CREATE INDEX test_index ON test_table (test_column) PCTFREE=20

You can change a current PCTFREE value by either dropping the index and recreating it with a new PCTFREE, or you can reorganize it with the DBS Utility. For example, the following command will reorganize index index_number_one created by smith and give it a PCTFREE value of 50:

     REORGANIZE INDEX (smith.index_number_one) PCTFREE=50

For more information on reorganizing indexes, refer to Reorganizing Fragmented Indexes.

Performance Indicator (IPCTFREE)

To determine the current percentage of space in each index page for future index entries, look in the IPCTFREE column of the SYSTEM.SYSINDEXES catalog table for your index.

Shadow Pages

Shadow pages are used whenever you make changes to your database. They use space in a storage pool that is only released during a checkpoint. If you are not careful to leave enough free space in your pools, shadow pages can fill them before the space is reclaimed at the next checkpoint. This is true even if you are only modifying rows and not adding new ones.

Each permanent (not internal) dbspace page has two entries in the page map table. One points to the current page while the other points to the shadow page. The current page contains any updates made to the page since the last checkpoint. The shadow page contains the original page as it was at the time of that checkpoint. (See Choosing the Checkpoint Interval for a discussion of checkpoints.) If there have been no changes to the table since the last checkpoint, both entries point to the current page.

The database manager uses this system to reconstruct changes to the database after a system failure. When the database manager is restarted after a system failure, it will use the page map table entries that point to the shadow pages. This effectively resets the database to its state at the last checkpoint. The LOG is then used to re-apply updates for LUWs committed after the last checkpoint.

When the database manager updates a page the following occurs:

A new physical page is allocated from a storage pool. This uses one physical 4KB page in a storage pool. It does not deplete the available data pages of the dbspace.
The current page map table entry is set to the new page location.
The new page is created in the local buffer pool.

At the next checkpoint the following occurs:

The new page in the buffer pool is written to the new physical location in a storage pool and the buffer page is released for reuse.
The shadow page map entries are set equal to the current page map entries, and the physical pages in the shadow page map entries that have been changed are released.

Note:

Do not confuse shadow page recovery with rollback work processing. Shadow pages are NOT released during a ROLLBACK. During a rollback the contents of the log are read and any changes to the database are undone. For example the database manager will undo a CREATE TABLE with a DROP TABLE. If a ROLLBACK were accomplished by falling back on shadow pages, you could not recover if a system failure occurred during ROLLBACK processing.

Determining the Number of Shadow Pages in Use

If you want to know how many pages are used by a specific update transaction, you can compare the number of PAGES USED (SHOW POOL) before and after the transaction. For example, consider a storage pool with two dbextents. Force a checkpoint (drop a dbspace created for the purpose), and then enter a SHOW POOL command:

+--------------------------------------------------------------------------------+
|show pool                                                                       |
|                                                                                |
|POOL NO.  1:     NUMBER OF EXTENTS = 2                                          |
|                                                                                |
|EXTENT   TOTAL    NO. OF      NO. OF      NO. OF     %                          |
| NO.     PAGES  PAGES USED  FREE PAGES  RESV PAGES  USED                        |
|   1       855        245         610                28                         |
|   2       855          0         855                 0                         |
|TOTAL     1710        245        1465          20    14                         |
|ARI0065I Operator command processing is complete.                               |
|
·
·
·
                                                                             |
+--------------------------------------------------------------------------------+

This pool has 245 PAGES USED. This total includes data, header, and pages index. However, since you have just forced a checkpoint it does not include any shadow pages.

If you now perform a transaction (for example an UPDATE statement) you can determine how many shadow pages it uses by reissuing the SHOW POOL operator command. For example, enter another SHOW POOL after the checkpoint:

+--------------------------------------------------------------------------------+
|show pool                                                                       |
|                                                                                |
|POOL NO.  1:     NUMBER OF EXTENTS = 2                                          |
|                                                                                |
|EXTENT   TOTAL    NO. OF      NO. OF      NO. OF     %                          |
| NO.     PAGES  PAGES USED  FREE PAGES  RESV PAGES  USED                        |
|   1       855        249         606                29                         |
|   2       855          0         855                 0                         |
|TOTAL     1710        249        1461          20    14                         |
|ARI0065I Operator command processing is complete.                               |
|
·
·
·
                                                                             |
+--------------------------------------------------------------------------------+

This time the pool has 249 PAGES USED. This means that your transaction used 4 shadow pages (249-245).

If you force another checkpoint, the database manager will now release all the shadow pages in the pool and reclaim the space. (There will be 245 PAGES USED and 610 FREE PAGES.)
Note: The above procedure will not be accurate if the transaction performs enough database modifications to cause a checkpoint to occur.

Running out of Dbspace Pages

If you have run out of pages in a dbspace it is because of one of two conditions, either the logical dbspace is full or the storage pool to which is assigned no longer has any unallocated physical pages.

Storage Pool Full

If all the pages in a storage pool have been allocated, reorganize the dbspaces allocated to it by dropping and recreating (reorganizing) them. This will reclaim space wasted because of fragmentation, refer to Reorganizing Data. If this does not reclaim enough space, you must add a dbextent to the pool. For instruction on how to do this, refer to the DB2 Server for VM System Administration or the DB2 Server for VSE System Administration manuals.

You cannot make more pages available by deleting tables or rows in another dbspace using the same pool. Deleted data pages are not returned to their pool. After a data page in a storage pool has been assigned to a specific dbspace, it cannot be used by another dbspace using the same pool until the entire dbspace is dropped.

You can use the SHOW DBEXTENT operator command (refer to page ***) to monitor the storage available in each storage pool, and you can use the SHOW POOL operator command (refer to page ***) to monitor the number of pages available in each dbextent in the pool.

Dbspace Full

A dbspace cannot be extended after it is defined (either during initial database generation or when it is added to a storage pool). Your only choices are to delete rows or tables in the dbspace itself, or unload the contents of the dbspace and reload them into a new dbspace that is larger than the original.

Alternatively, just over allocate the dbspace when you acquire it (refer to the ACQUIRE dbspace command in the DB2 Server for VSE & VM SQL Reference manual). A dbspace is only a logical allocation of space in the form of directory page tables. You will not actually consume the total number of pages in the storage pool that you defined for all the dbspaces in the pool. Define the size of dbspace based on how large they may become, but define the size of the storage pool based on how much storage you need right now, which includes shadow pages. As your tables grow and you need more pages, just add dbextents to the pool.
Note: The amount of storage you need right now includes space for shadow pages.

You can use the SHOW DBSPACE operator command (refer to page ***) to monitor the number of header, data, and index pages allocated to the dbspace and the percentage of each actually in use.

Shadow Pages

You may find that even though you have not added a significant number of new rows to a dbspace it may become full. This occurs because every time you modify a existing index or data page (or create a new one) a shadow page is created. These pages require additional storage that is not reclaimed until the next checkpoint, refer to Shadow Pages. To avoid this problem, ensure that there is enough free space in your storage pools to accommodate shadow pages. You will usually require between 15% and 25% free space measured immediately after a checkpoint. For example:

+--------------------------------------------------------------------------------+
|show dbextent                                                                   |
|                                                                                |
|POOL     TOTAL    NO. OF      NO. OF      NO. OF     %    NO. OF                |
|NO.      PAGES  PAGES USED  FREE PAGES  RESV PAGES  USED  EXTENTS SOS           |
|   1      1710     1410         300         20       83      2                  |
|FREE    268626                                                                  |
|ARI0065I Operator command processing is complete.                               |
+--------------------------------------------------------------------------------+

This dbextent has 17% free space (100-83). You may want to add space to ensure that shadow pages will not become a storage problem.

Ever Increasing Index

One reason for running out of index pages is that you are using an "ever increasing index". For example, consider a table where you only keep data for three months. Every month you delete any rows that were created more than three months earlier. To keep track of the creation dates, you use a date column, or timestamp and create an index on that column.

Unfortunately in this example, even though you delete old rows, the pages that contain their index keys remain allocated to the table. They are not released for reuse. They also remain allocated to the same range of values (or dates), so in this example, they may never be reused. For example, if one index page contained keys for a range of dates from March 1, 1996 to April 14, 1996, it will only ever be reused for that range.

If you have this type of index, you must constantly monitor the percentage of free index pages in the dbspace. You can use the SHOW DBSPACE operator command, refer to Proportion of Available Pages.

To recover the wasted storage used by an ever increasing index, you must reorganize it. Refer to Reorganizing Fragmented Indexes.

Data Clustering

Clustered Indexes

You can say that an index is clustered if the data is logically stored in an order which closely matches the sequence of the index. That means that, ideally, when you retrieve the rows following the order of a clustered index, the database manager can do so by looking at a minimum number of pages.

Consider the following; all the rows in a table are retrieved in the sequence of an index. As the database manager retrieves each row, it counts the number of times it needs to access a different page than the one it is currently using.

In the best case, the number of pages accessed is exactly equal to the number of pages occupied by that table within the dbspace. A data page is read, all the rows of the subject table in that page are retrieved, and then the next page is read. In this case, the pages are read sequentially - each page read only once.

Remember that saying an index is clustered really means that the table is clustered relative to the index. If the database manager can use the index to sequentially retrieve the rows in the table by looking at a minimal number of data pages, the index is clustered. Another index acting on the same data may or may not be considered clustered.

The Clustering Index

The first index created on a table is, by default, the clustering index. It inserts new rows into data pages so that as many pages as possible are clustered relative to the clustering index, refer to Clustered Indexes.

Default Clustering versus Clustering Index Strategy

When data is inserted into a table, there are two strategies for finding a place for the data in the dbspace: default logic and clustering index logic. Essentially the default logic places any new rows at the end of the table, while the clustering index logic places a new row in index sequence, as much as possible. While the clustering strategy tries to keep a clustering index clustered, the only way to ensure that it is completely clustered is to reorganize the data, refer to Reorganizing Data.

The default logic strategy is used if a clustering index is not available (indicated by a "D" in the CLUSTERTYPE column in SYSTEM.SYSCATALOG for the table). This strategy uses the value in the CLUSTERROW column in SYSTEM.SYSCATALOG for the table to determine the starting point to look for available space for the insert. The value in CLUSTERROW is a pointer to the end of the table. If the value in CLUSTERROW is significantly incorrect, the database manager has to do extra work to find a page that has sufficient free space to hold the row to be inserted. The value of CLUSTERROW can be significantly incorrect if UPDATE STATISTICS has not been executed recently or an application program that is doing the insert has not been preprocessed (prepped) recently. Because a preprocessed program that inserts with the default logic stores the value of CLUSTERROW in the package, you must periodically preprocess this kind of program to update the CLUSTERROW value in the package.

The clustering index strategy is used if a clustering index is available (indicated by a "I" in the CLUSTERTYPE column in SYSTEM.SYSCATALOG for the table). This strategy attempts to place the new row on the same page as rows with similar key values. This determines the starting point to look for available space for the insert. If there is no available space on the pages at or near this starting point then the database manager must do additional work to find a page that has sufficient free space to hold the row to be inserted. Insufficient free space can occur because no free space was established for the dbspace or because inserts have used all the free space. If you reorganize the dbspace, refer to Reorganizing all the Tables in a Dbspace, you can establish free space for inserts.

When you create a table, CLUSTERTYPE is set to "D" and CLUSTERROW is set to zero. When you create the first index on a table, CLUSTERTYPE is set to "I". If you reorganize the clustering index (refer to Reorganizing Fragmented Indexes) it will remain the clustering index. If you drop the clustering index, CLUSTERTYPE is set back to "D". To establish a different index as the clustering index you usually drop all indexes on the table, create the new clustering index as the first index, and then create any other indexes. Refer to Reorganizing a Single Table and Reorganizing all the Tables in a Dbspace. You can also change the clustering index by updating the SYSTEM.SYSINDEXES catalog table, refer to Changing the Clustering Index without Dropping Indexes.

How Indexes Become Unclustered

Indexes become unclustered when:

A significant number of rows were added to a table since the clustering index was first created or since the data was last reorganized
And there was insufficient free space available to put the rows into their optimal locations.

You can increase the number of rows that you can add before the index becomes unclustered by increasing the PCTFREE setting when you reorganize your data. Refer to Reorganizing a Single Table.

Identifying Unclustered Indexes

Deciding whether an index is clustered requires some judgement. First, you need to execute an UPDATE STATISTICS statement against the table the index belongs to. Second, you need to look at the CLUSTERRATIO and the CLUSTER column in the SYSTEM.SYSINDEXES catalog table.

The CLUSTERRATIO value is used by the optimizer to choose a suitable index for access path selection. This value represents a percentage, with the two decimal places implied. The value is calculated by:

                       ROWCOUNT - PAGE JUMPS
CLUSTERRATIO = 10000 * ---------------------
                       ROWCOUNT - PAGE COUNT
 
where: PAGE COUNT = the number of pages the table occupies
       PAGE JUMPS = the number of times a different data page is
                    referenced to access all the data in the table
                    in index order

The CLUSTERRATIO value ranges between 0 and 10000, and indicates the percentage of time that the table's row, when retrieved using that index, are in logical page sequence.

The CLUSTER value, in addition to giving a general idea about whether the index is clustered, is also used to identify the clustering index for the table.

Table 2. CLUSTER values

CLUSTER Value Clustered Clustering
F Yes Yes
C Yes No
W No Yes
N No No

CLUSTER Value	Clustered	Clustering
F	Yes	Yes
C	Yes	No
W	No	Yes
N	No	No

The CLUSTER column will show that an index is not clustered if the following is true:

         PAGE JUMPS
110% <   ----------- x 100
         PAGE COUNTS

(The number of jumps per page is greater than 1.1.)

Clustering VIEW

You can include all the important information about clustering and indexes in one VIEW. For example, the VIEW should contain the following information:

The name of the index and its creator
The CLUSTERRATIO of the index
The CLUSTER value of the index
The number of rows in a table that the index acts on
The number of pages in the table that the index acts on

For example:

+--------------------------------------------------------------------------------+
|SELECT i.iname,                                                                 |
|       i.clusterratio,                                                          |
|       i.cluster,                                                               |
|       t.rowcount,                                                              |
|       t.npages                                                                 |
|       FROM system.sysindexes i, system.syscatalog t                            |
|       WHERE t.tname = i.tname and                                              |
|             t.creator = i.creator                                              |
+--------------------------------------------------------------------------------+

to help you determine if your index is clustered, your view should also include:

The number of jumps
The number of jumps per page.

While the CLUSTERRATIO and CLUSTER values are very useful in determining how clustered a index is, you may find it useful to see how many jumps the database manager makes for each page it reads. Remember each additional jump per page represents an unnecessary I/O. You may also want to compare the number of jumps per page to the number of rows per page. Unfortunately there is no concrete rule you can use to decide when an index is unclustered. However, the more information you have available the better you will be able to get a "feel" for the state of the index.

To calculate the number of jumps, rearrange the clusterratio calculation to solve for jumps instead of clusterratio. For example:

                         CLUSTERRATIO
PAGE JUMPS = ROWCOUNT - ------------- X ( ROWCOUNT - PAGE COUNT)
                            10000

If the page jumps calculation was included in a SELECT with the name of the index, the SELECT would look like this:

+--------------------------------------------------------------------------------+
|SELECT i.iname,                                                                 |
|       (t.rowcount-( i.clusterratio/10000.0*(t.rowcount-t.npages)))             |
|       FROM system.sysindexes i, system.syscatalog t                            |
|       WHERE t.tname = i.tname and                                              |
|             t.creator = i.creator                                              |
+--------------------------------------------------------------------------------+

The following SELECT statement:

Combines the previous two SELECT statements and adds the jumps per page ratio.
Orders the results by the jumps per page ratio, so you can see the indexes with the worst ratio first
Excludes indexes created against empty tables.

+--------------------------------------------------------------------------------+
|SELECT i.iname,                                                                 |
|       i.clusterratio,                                                          |
|       i.cluster,                                                               |
|       t.rowcount,                                                              |
|       t.npages,                                                                |
|       (t.rowcount-(i.clusterratio/10000.0*(t.rowcount-t.npages))),             |
|       ((t.rowcount-(i.clusterratio/10000.0*(t.rowcount-t.npages))))/t.npages   |
|       FROM system.sysindexes i, system.syscatalog t                            |
|       WHERE t.tname = i.tname and                                              |
|             t.creator = i.creator and                                          |
|             t.npages > 0 and                                                   |
|             t.rowcount > 0                                                     |
|       ORDER BY 7 desc                                                          |
+--------------------------------------------------------------------------------+

Reorganizing Data

You should reorganize data for one of four reasons:

A table's clustering index has become unclustered. Refer to Identifying Unclustered Indexes.
The number of overflow rows in a table (NOVERFLOW) exceeds 5% of the total number of rows in the table (ROWCOUNT). Refer to Free Space in Data Pages.
To change which index acts as the clustering index.
To return once populated but now empty data pages to the storage pool, refer to Storage Pool Full.

The first two conditions indicate that the rows can no longer be efficiently retrieved.

Essentially, reorganizing involves unloading the data and reloading it. Unload the data, making sure that the clustering index exists. If a clustering index is available, the data is unloaded following its sequence. Drop the clustering index and reload the data (it will be reloaded in the order of the clustering index). Then recreate the clustering index. This reclusters the data according to the clustering index, and reclaims space that was lost because of row overflow.

The following instructions assume that you are using the DBS utility to unload and reload tables. For more information on its use, refer to the DB2 Server for VSE & VM Database Services Utility manual.

There are several questions you need to ask before you choose a way to reorganize your data.

If you are reorganizing all the tables in a dbspace at once and there are not many tables in that dbspace with field procedures), and it is not difficult for you to recreate all the indexes, referential constraints, and unique keys in that dbspace, follow the instructions in Reorganizing all the Tables in a Dbspace. While this set of instructions requires you to recreate the entire dbspace, the actual process of reloading the data is faster than the following alternative.
If you only need to reorganize one or two tables or if your table contains columns with field procedures, or if it is too much work to recreate all their indexes, referential constraints, and unique keys, follow the instructions in Reorganizing a Single Table. However, if you also want to change which index acts as the clustering index, you will have to drop and recreate all the table's indexes.
If you only want to change the clustering index in a single table that uses several other indexes, follow the instructions in Changing the Clustering Index without Dropping Indexes. Unlike the previous procedure, this set of instructions does not require you to drop and recreate all the table's indexes.

Reorganizing a Single Table

The following method reorganizes a single table. It uses the DBS Utility RELOAD PURGE command. While it is not as fast as the RELOAD NEW command, you do not need to manually drop and recreate any indexes, referential constraints, and unique keys (unless you want to change the clustering index).

If you want to change which index acts as the clustering index, do the following:
1. Drop all indexes for the table by issuing a DROP INDEX statement for each one.
2. Create a new index (using the CREATE INDEX statement). This index will act as the clustering index.
Unload the table (usually to tape), by issuing a DBS Utility UNLOAD TABLE command. The rows are automatically unloaded in the key sequence of the clustering index.
Set the PCTFREE value of the dbspace to a high enough value to allow space on pages for future clustered insertion of rows.
Set UPDATE STATISTICS ON if you want to automatically collect statistics during the RELOAD, or set it OFF if you plan to UPDATE ALL STATISTICS after the RELOAD. Refer to Automatic Statistics Collection.
Reload the table by issuing a DBS utility RELOAD command with the PURGE option specified.
During RELOAD command processing with the PURGE option specified, all rows of the specified table are deleted. As part of the PURGE, the DBS utility drops the clustering index, deactivates any active primary keys, active foreign keys, and active unique keys, and deletes all indexes on the table before deleting and reloading the data. After the table has been reloaded, the DBS utility recreates the clustering index, primary key, and unique keys, and recreates the remaining indexes. It then reactivates all the foreign keys it dropped. Since packages are invalidated because of table index deletions, they are dynamically repreprocessed the next time someone attempts to execute the package.
Reduce PCTFREE to make the free space available for use on normal INSERT activity.
If you set UPDATE STATISTICS OFF, collect statistics for all columns by issuing the UPDATE ALL STATISTICS command.
If you changed which indexes acts as the clustering index, recreate the other table indexes required, using CREATE INDEX statements. The definition of all indexes on a table can normally be determined by querying the SYSTEM.SYSINDEXES system catalog table, as long as the length of the column names on which the index is defined is less than 100 characters.

Reorganizing all the Tables in a Dbspace

The following method reorganizes all the tables in a dbspace.

Record any index, referential constraint, unique key definitions, or field procedures authorizations in the dbspace.
If you want to change which index acts as the clustering index for any tables in the dbspace, do the following for those tables:
1. Drop all indexes for the table by issuing a DROP INDEX statement for each one.
2. Create a new index (using the CREATE INDEX statement). This index will act as the clustering index.
Unload the dbspace (usually to tape), by issuing a DBS Utility UNLOAD DBSPACE command. The tables will be unloaded in the order of the clustering index.
Drop and recreate the dbspace.
Set the PCTFREE value of the dbspace to a high enough value to allow space on pages for future clustered insertion of rows.
Set UPDATE STATISTICS ON if you want to automatically collect statistics during the RELOAD, or set it OFF if you plan to UPDATE ALL STATISTICS after the RELOAD. Refer to Automatic Statistics Collection.
Reload the dbspace by issuing a DBS Utility RELOAD DBSPACE command with the NEW option specified.
The NEW option assumes that none of the tables you are reloading currently exist in the dbspace. A program that accesses a table, the index of which was dropped, is re-preprocessed when it is next executed, which ensures that it takes advantage of the new clustering properties.

If a table in the dbspace has field procedures associated with it, the table should be dropped and recreated to include the field procedures and reloaded using the PURGE parameter. It is not necessary to unload the table again, as the table can be reloaded from the unloaded dbspace file.
Recreate the clustering index.
Reduce PCTFREE to make the free space available for use on normal INSERT activity.
If you set UPDATE STATISTICS OFF, collect statistics for all columns by issuing the UPDATE ALL STATISTICS command.
Recreate the other table indexes, any referential constraints and any unique keys.

Changing the Clustering Index without Dropping Indexes

The following method reorganizes a single table, and changes which index will act as the clustering index. It eliminates the need to individually drop and recreate all indexes on the table. (The steps can be performed in a single execution of the DBS Utility.)

On the SYSTEM.SYSINDEXES table entry for the original clustering index, update the CLUSTER column value of "F" or "W" to "N".
Change the value in the CLUSTERRATIO column to 1000 (10.00%).
If the new clustering index does not exist, create it with a CREATE INDEX statement.
On the SYSTEM.SYSINDEXES catalog table entry, update the CLUSTER column for the new clustering index to the value "W".
Change the value in the CLUSTERRATIO column to 7500 (75.00%).
Unload the table by issuing a DBS Utility UNLOAD TABLE command.
Set UPDATE STATISTICS ON if you want to automatically collect statistics during the RELOAD, or set it OFF if you plan to UPDATE ALL STATISTICS after the RELOAD. Refer to Automatic Statistics Collection.
Reload the table by issuing a DBS Utility RELOAD command with the PURGE option specified.
During RELOAD command processing with the PURGE option specified, all rows of the specified table are deleted and the table index (if one exists) is dropped and recreated. A program that accesses a table, the index of which was dropped, is re-preprocessed when it is next executed, which ensures that it takes advantage of the new clustering properties.
If you set UPDATE STATISTICS OFF, collect statistics for all columns by issuing the UPDATE ALL STATISTICS command.

Index Fragmentation

A fragmented index is characterized by excessive amounts of free space in the index pages, which usually is spread unevenly among the pages. Free space distributed unevenly implies that index keys are also distributed unevenly. Indexes can become fragmented by insert, delete, and update activity on the table.

To help prevent index fragmentation, indexes should be created after the data has been loaded into the table, and an adequate PCTFREE value should be specified for the index.

If the index is created before the data is loaded, page splits occur and the index becomes fragmented when the data is loaded. In fact, if the data is loaded in clustering order, each index page of the clustering index has 50% free space.

If a sufficient PCTFREE value is specified for the index when it is created, subsequent inserts do fit on the existing index page, avoiding index page splits.

Indexes must either be reorganized or dropped and recreated to correct the fragmentation. If they are dropped and recreated, any packages with dependencies on them are marked invalid. In addition, if a clustering index is dropped, it no longer functions as the clustering index if there are other indexes on the table. In this case, all indexes would have to be dropped, the clustering index recreated, and then the rest of the indexes recreated. If indexes are reorganized, dependent packages are not marked invalid, and the clustering properties do not change.

Reorganizing Fragmented Indexes

To determine whether an index should be reorganized, enter the SHOW DBSPACE operator command to see how many index pages are occupied in the dbspace, and what the actual percentage of free space in the occupied pages is. Next, determine the expected percentage of free space by averaging the PCTFREE settings of all the indexes. If the actual free space is appreciably higher than the expected amount, index fragmentation or skewed index values are the likely cause.

There are two ways to reorganize an index. One is to obtain all index definitions from the catalog tables, drop the index with the DROP INDEX statement, then recreate it with the CREATE INDEX statement.

The other is to enter the following DBS Utility command:

   REORGANIZE INDEX (index-name)

You must be the owner of the index or have DBA authority.

The advantages of the REORGANIZE INDEX utility are:

A dbspace scan is not required to retrieve the rows of the table.
A sort of the index key columns is not required.
Dependent packages are not invalidated and therefore do not require re-preprocessing.
The clustering property of a clustering index is not lost. (If there is more than one index on a table, and a clustering index is reorganized by being dropped and re-created, it is no longer the clustering index.)

For more information on the REORGANIZE INDEX utility, see the DB2 Server for VSE & VM Database Services Utility manual.

Notes:

You must use the ALTER TABLE statement to reorganize an index that was created by the database manager to enforce the uniqueness of a primary key or a unique constraint (see the DB2 Server for VSE & VM Database Administration manual).
A different utility is provided to reorganize the catalog table indexes (see the DB2 Server for VSE & VM Database Administration manual.).
Reorganizing an index is not a solution for an unclustered index. To correct an unclustered index, you must reorder the data to match the index sequence, refer to Reorganizing Data. In addition, issuing the REORGANIZE INDEX command does not return freed pages to the storage pool. The freed pages are only returned to the storage pool if you drop the dbspace.

Invalid Indexes

An index can become invalid in the following ways.

During a ROLLBACK or UNDO operation, if the database manager requires a free index page but is unable to reclaim any, the index is marked invalid. More than one index can become invalid during the LUW. Rollback, UNDO, or REDO processing continues, but no updates are made to invalid indexes, and thus they no longer reflect the data. These indexes cannot be used until they have been reorganized, or, dropped and recreated.
An index can be marked invalid if duplicates have occurred in a unique index. This can only happen if:
- a checkpoint occurs during a searched UPDATE deferring checking of uniqueness,
- a system failure occurs before the end of the statement, and
- the database is started with an empty log.
At the end of initialization, any unique indexes that contain duplicates are marked invalid.
An index can also be marked invalid if the following events occur in order:
- A checkpoint occurs during a CREATE or REORGANIZE INDEX.
- A system failure occurs before the database manager can complete the CREATE or REORGANIZE statement.
- The application server is restarted with an empty log.

When an index is marked invalid, packages that use that index are not marked invalid; however, the packages will become invalid if the index is dropped. If the index is reorganized, the packages will remain valid.

Additional details about invalid indexes can be found under the SHOW INVALID command in the DB2 Server for VSE & VM Operation manual.

Transient Indexes

An index can be marked transient in the following ways.

An index is marked transient during a CREATE INDEX statement or REORGANIZE INDEX command. In this case, the index remains transient for the duration of the statement. When the index has been created or reorganized successfully, the index is marked valid.
A unique index can be marked transient during a searched UPDATE statement where uniqueness checking is being deferred. In this case, the index remains transient for the duration of the LUW. The index is marked transient when the first duplicate is inserted. When the statement is completed, if duplicates still exist SQLCODE -803 (SQLSTATE 23505) is issued, and the UPDATE statement is rolled back. The index is marked valid at the end of the LUW.

Additional details about transient indexes can be found under the SHOW INVALID command in the DB2 Server for VSE & VM Operation manual.

Reorganizing an Invalid Index

Use the SHOW INVALID operator command to display all invalid indexes in the database, as well as the reason why each index is invalid.

Use the REORGANIZE INDEX utility to revalidate an invalid index that is invalid because you encountered a NO ROOM IN THE STORAGE POOL message.

If the invalid index was created to support a primary key or a unique constraint, it can be reorganized with the ALTER TABLE table_name ACTIVATE key_name command.

When reorganizing an invalid index, the database manager must scan the dbspace and sort the index keys, because the invalid index may not contain all the keys.

You cannot use the REORGANIZE INDEX utility to revalidate a unique index that contains duplicates causing it to be marked invalid. You must drop this index, remove the duplicates, and re-create it. If the index was created to support a primary key or a unique constraint, you must deactivate the primary key or unique constraint with the ALTER TABLE table_name DEACTIVATE key_name command, remove the duplicates, and reactivate the primary key or unique constraint with the ALTER TABLE table_name ACTIVATE key_name command.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]