Data Warehouse Center Administration Guide

Chapter 10. Maintaining the warehouse database

This chapter describes how to use warehouse programs to maintain your warehouse database:

You can use the DB2 reorganize warehouse program to rearrange a table in physical storage, eliminating fragmentation and making sure that the table is stored efficiently in the database. You can also use reorganization to control the order in which the rows of a table are stored, usually according to an index.
You can use the DB2 runstats warehouse program to gather statistics about the physical and logical characteristics of a table and its indexes. DB2 Universal Database uses these statistics to determine the best way to access your data.

For information about tuning and configuring your database for optimal performance, see DB2 Universal Database Administration Guide: Performance.

Reorganizing data

You can reorganize data in a DB2 Universal Database table or in a DB2 UDB for OS/390 table space or index.

Defining values for a DB2 Universal Database REORG program

Use the Step Properties notebook for the DB2 UDB REORG warehouse program to create a step that can be used to reorganize data in a DB2 Universal Database table.

You can use a warehouse source or target as a source for this step subtype. The REORG program writes to the source table.

The Column Mapping page is not available in this step.

To define values for a step that runs a DB2 UDB REORG warehouse program:

Open the step notebook.
Specify general information about the program. For more information, see Providing general information about a step.
Optional: On the Parameters page, specify information for the REORG step:
- In the Using temporary table space field, type the name of the temporary table space that should be used during the REORG step.
- In the Using index field, type the name of the index that should be used during the REORG step.
On the Processing Options page, provide information about how your step processes. For more information, see Defining processing options.
Click OK to save your changes and close the step notebook.

Defining values for a DB2 UDB for OS/390 Reorganize Table Space program

Use the DB2 UDB for OS/390 Reorganize Table Space warehouse program to reorganize a table space or index to improve access performance and to reclaim fragmented space. In addition, you can reorganize a single partition of either a partitioned index or a partitioned table space.

To define a values for a step that uses this warehouse program:

Open the step notebook.
Specify general information about the warehouse program. For more information, see Providing general information about a step.
On the Parameters page, double-click the Parameter value field to add values for the corresponding parameters shown in the Parameter name field. Here is a list of the parameters and their descriptions:
UTILITY_ID
Specifies a unique identifier (UID) for this utility within DB2. This is a string of 1-16 characters.
RESTART
Specifies whether this warehouse program restarts a current utility, and, if so, at what point it is to be restarted. NO or null indicates the utility is new, not a restart. There must not be any other utility with the same utility identifier (UID). The default is null.
UTSTMT
Specifies the utility control statements. This is an input parameter of type VARCHAR(32704).
RETCODE
Specifies the utility highest return code. This is an output parameter of type INTEGER.
UTILITY_NAME
Specifies the utility that you want to start. Specify only a single utility that requires data set allocation in the UTSTMT parameter. The utility name is one of the following:
- REORG INDEX
- REORG LOB
- REORG TABLESPACE
RECDSN
Specifies the cataloged data set name that is required when reorganizing table spaces as the unload data set. If you specify RECDSN, it will be allocated to the SYSREC DDNAME.
RECDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set listed in RECDSN resides.
RECSPACE
Specifies the number of cylinders to use as the primary space allocation for the data set specified by RECDSN. The secondary space allocation will be 10% of the primary space allocation.
DISCDSN
Specifies the cataloged data set name used when reorganizing table spaces as a discard data set to hold records not reloaded. If you specify DISCDSN, it will be allocated to the SYSDISC DDNAME.
DISCDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by DISCDSN resides.
DISCSPACE
Specifies the number of cylinders to use as the primary space allocation for the data set specified by DISCDSN. The secondary space allocation will be 10% of the primary space allocation.
PNCHDSN
Specifies the cataloged data set name that is used when reorganizing table spaces with the keywords UNLOAD EXTERNAL or DISCARD. The data set is used to hold the generated LOAD utility control statements. If you specify a value for PNCHDSN, it will be allocated to the SYSPUNCH DDNAME.
PNCHDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by PNCHDSN resides.
PNCHSPACE
Specifies the number of cylinders to use as the primary space allocation for the data set specified by PNCHDSN. The secondary space allocation will be 10% of the primary space allocation.
COPYDSN1
Specifies the name of the target (output) data set. If you specify COPYDSN1, it will be allocated to the SYSCOPY DDNAME.
COPYDEVT1
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by COPYDSN1 resides.
COPYSPACE1
Specifies the number of cylinders to use as the primary space allocation for the data set specified by COPYDSN1. The secondary space allocation will be 10% of the primary space allocation.
COPYDSN2
Specifies the name of the cataloged data set used as a target (output) data set for the backup copy. If you specify COPYDSN2, it will be allocated to the SYSCOPY2 DDNAME.
COPYDEVT2
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by COPYDSN2 resides.
COPYSPACE2
Specifies the number of cylinders to use as the primary space allocation for the data set specified by COPYDSN2. The secondary space allocation will be 10% of the primary space allocation.
RCPYDSN1
Specifies the name of the cataloged data set used as a target (output) data set for the remote site primary copy. If you specified RCPYDSN1, it will be allocated to the SYSRCPY1 DDNAME.
RCPYDEVT1
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the RCPYDSN1 data set resides.
RCPYSPACE1
Specifies the number of cylinders to use as the primary space allocation for the data set specified by RCPYDSN1. The secondary space allocation will be 10% of the primary space allocation.
RCPYDSN2
Specifies the name of the cataloged data set used as a target (output) data set for the remote site backup copy. If you specify RCPYDSN2, it will be allocated to the SYSRCPY2 DDNAME.
RCPYDEVT2
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by RCPYDSN2 resides.
RCPYSPACE2
Specifies the number of cylinders to use as the primary space allocation for the data set specified by RCPYDSN2. The secondary space allocation will be 10% of the primary space allocation.
WORKDSN1
Specifies the name of the cataloged data set that is required as a work data set for sort input and output. If you specify WORKDSN1, it will be allocated to the SYSUT1 DDNAME.
WORKDEVT1
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by WORKDSN1 resides.
WORKSPACE1
Specifies the number of cylinders to use as the primary space allocation for the data set specified by WORKDSN1. The secondary space allocation will be 10% of the primary space allocation.
WORKDSN2
Specifies the name of the cataloged data set that is required as a work data set for sort input and output. It is required if you are using reorganizing non-unique type 1 indexes. If you specify WORKDSN2, it will be allocated to the SORTOUT DDNAME.
WORKDEVT2
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by WORKDSN2 resides.
WORKSPACE2
Specifies the number of cylinders to use as the primary space allocation for the WORKDSN2 data set. The secondary space allocation will be 10% of the primary space allocation.
MAPDSN
Specifies the name of the cataloged data set that is required as a work data set for error processing during LOAD with ENFORCE CONSTRAINTS. It is optional for LOAD. If you specify MAPDSN, it will be allocated to the SYSMAP DDNAME.
MAPDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by MAPDSN resides.
MAPSPACE
Specifies the number of cylinders to use as the primary space allocation for the MAPDSN data set. The secondary space allocation will be 10% of the primary space allocation.
ERRDSN
Specifies the name of the cataloged data set that is required as a work data set for error processing. If you specify ERRDSN , it will be allocated to the SYSERR DDNAME.
ERRDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by ERRDSN resides.
ERRSPACE
Specifies the number of cylinders to use as the primary space allocation for the ERRDSN data set. The secondary space allocation will be 10% of the primary space allocation.
FILTRDSN
Specifies the name of the cataloged data set that is required as a work data set for error processing. If you specify FILTRDSN, it will be allocated to the FILTER DDNAME.
FILTRDEVT
Specifies a unit address, a generic device type, or a user-assigned group name for a device on which the data set specified by FILTRDSN resides.
FILTRSPACE
Specifies the number of cylinders to use as the primary space allocation for the FILTRDSN data set. The secondary space allocation will be 10% of the primary space allocation.
On the Processing Options page, provide information about how your step processes. For more information, see Defining processing options.
Click OK to save your changes and close the step notebook.

Updating system catalog statistics

You can update system catalog statistics for a DB2 UDB or DB2 for OS/390 database.

Defining values for a DB2 UDB RUNSTATS program

Use the Step Properties notebook for the DB2 UDB RUNSTATS warehouse program to create a step that can be used to update system catalog statistics on the data in a table, the data in the table indexes, or the data in both the table and its indexes. The optimizer uses these statistics to choose which path will be used to access the data.

In general, you need to update statistics if there are extensive changes to the data in the table.

You can use a warehouse source or target as a source for this step subtype. The RUNSTATS program writes to the source table. Link a source to the step subtype in the Process Model window before you define the values for the step.

The Column Mapping page is not available in this step.

To define values for a step that runs a DB2 UDB RUNSTATS warehouse program:

Open the step notebook.
Specify general information about the warehouse program. For more information, see Providing general information about a step.
Optional: On the Parameters page, specify information for the RUNSTATS warehouse program:
- Specify the level of statistics you want to gather for the table by clicking a radio button under Statistics for the table.
- Specify the level of statistics you want to gather for the table's indexes by selecting a radio button under Statistics for the indexes.
- Use the Share level radio buttons to specify the type of access you want other users to have to the table while the statistics are being gathered.
On the Processing Options page, provide information about how your step processes. For more information, see Defining processing options.
Click OK to save your changes and close the step notebook.

Defining values for a DB2 UDB OS/390 RUNSTATS program

Use the DB2 UDB for OS/390 RUNSTATS warehouse program to gather summary information about the characteristics of data in table spaces, indexes and partitions. The output is recorded in the DB2 UDB for OS/390 catalog tables. DB2 uses this information to select access paths to data during the bind process. The information is available to the database administrator for evaluating database design and to aid in determining when table spaces or indexes must be reorganized.

The Column Mapping page is not available for this step subtype.

To define values for a step that uses this warehouse program:

Open the step notebook.
Specify general information about the warehouse program. For more information, see Providing general information about a step.
On the Parameters page, click one of the Share level radio buttons to specify the access that other programs can use on the table space while you are gathering statistics:
- Specify Reference only if you want to allow read-only access by other programs. This is the default.
- Specify Allow changes if you want to allow other programs to change the table space or index. Uncommitted data can be collected into statistical summaries.
Click one of the Report statistics radio buttons to specify if a set of messages is generated to report the collected statistics. Specify Yes if you want the set of messages sent to the SYSPRINT data set. Specify No to not send messages to SYSPRINT. This is the default.
Click one of the Update options on catalog tables radio buttons to specify whether the collected statistics are inserted into the DB2 for OS/390 catalog tables:

Access path and space statistics
The result of gathering the statistics updates all the statistics in the DB2 catalog.
Access path statistics only
The result of gathering the statistics updates only the catalog table columns used for access path selection.
Space statistics only
The result of gathering the statistics updates only the catalog table columns that help a database administrator assess the status of a particular table space or index.
Do not update catalog tables
The result of gathering the statistics does not update any catalog statistics. Available only when the Report Statistics radio button is YES.
In the Percent rows sampled for non-index columns field, specify a percentage of rows to sample when collecting non-indexed column statistics. Valid values are 1 to 100 percent. The default is 25 percent. Do not use this field if your table space is a LOB table space.
The columns in a table for which you are gathering statistics are listed in the Available columns list. Click on a column and click > to move a column from the Available columns list to the Selected columns list. You can move all the columns in the table by clicking >>. Click <or << to move columns from the Selected columns list to the Available columns list. All columns in the Selected columns list are the columns on which statistics are gathered. You can select a maximum of 10 columns with one table space statistics operation.
On the Processing Options page, provide information about how your step processes. For more information, see Defining processing options.
Click OK to save your changes and close the step notebook.

[ Top of Page | Previous Page | Next Page ]