Planning for the generation of a database entails establishing logical and physical limits for its capacity, and setting its initial DASD allocations.
The parameters that you must establish at this time are summarized in Table 2. This figure also shows the IBM-provided values used for the starter database.
Note: | The parameters that have a Yes entry in the Fixed column must be established during generation of the database, and cannot be changed for the lifetime of the database. Also note that some parameters are established by the VM directory MDISK control statements, whereas others are established by input to an IBM-supplied EXEC called SQLDBGEN. |
Following the figure is a discussion of how to set these parameters, and of
the issues to consider when setting them.
Table 2. Database Parameters Set at Database Generation Time
Parameter | Default | Minimum | Maximum | Starter Database | Fixed | Set by |
---|---|---|---|---|---|---|
Database directory size | None |
1 cylinder | 1 volume | 34 cylinders | No | MDISK |
Log data set (or data sets) -Size (each) -Number |
None None |
1 cylinder 1 |
524,287 4Kb pages 2 volumes |
8 cylinders 1 |
No |
MDISK |
Maximum number of storage pools (MAXPOOLS) | 32 | 1 | 999 | 256 | Yes | SQLDBGEN |
Maximum number of dbextents (MAXEXTNT) | 64 | 1 | 999 | 256 | Yes | SQLDBGEN |
Maximum number of dbspaces (MAXDBSPC) | 1000 | 7 | 32000 | 10240 | Yes | SQLDBGEN |
Catalog dbspace (PUBLIC.SYS0001) Size (4 kilobyte pages) | None | 128 | 8388607 | 12800 | Yes | SQLDBGEN |
First package dbspace (PUBLIC.SYS0002) Size (4 kilobyte pages) | None | 128 | 8388607 | 2048 | Yes | SQLDBGEN |
HELP text dbspace (PUBLIC.HELPTEXT) Size (4 kilobyte pages) | None | 2304 | 8388607 | 8192 | No | SQLDBGEN |
ISQL dbspace (PUBLIC.ISQL) Size (4 kilobyte pages) | None | 128 | 8388607 | 1024 | No | SQLDBGEN |
SAMPLE dbspace (PUBLIC.SAMPLE) Size (4 kilobyte pages) | None | 512 | 8388607 | 512 | No | SQLDBGEN |
Internal dbspaces -Size (each) (4 kilobyte pages) -Number |
None None |
128 2 |
8388607 31997 |
1024 80 |
No |
SQLDBGEN |
Initial dbextents -Size (each) -Number |
None None |
1 cylinder 1 |
1 volume 999 |
77 cylinders 1 |
No |
MDISK |
Notes:
The DB2 Server for VM directory (called BDISK) contains control information and page tables for mapping dbspace page references to physical DASD locations. Its size determines the maximum number of dbextent pages and the number of page table entries that can be supported by the database being generated.
If necessary, you can later expand the directory to hold more dbspace pages, or more dbspace and dbextent pages. Refer to Expanding the Database Directory for more details.
The directory for the database is defined by adding an MDISK control statement to the VM directory entries for a database machine. If Data Spaces Support is used, 4096-byte blocks can be used for the directory, but otherwise the database manager requires the use of 512-byte blocks for its directory. The SQLDBGEN EXEC does the actual formatting of the minidisk. The MDISK parameters you supply determine the number of blocks in the directory minidisk.
Table 3 shows the recommended cylinder (or block) allocations for
various DASD device types, based on assumed maximum database sizes.
Table 3. Recommended Directory Allocations for Various Database Sizes
Directory Space for Various IBM Storage Devices | |||||
---|---|---|---|---|---|
Maximum Database Size |
3375 |
3380 |
3390 |
9345 |
FB-512 BLOCKS |
10 megabytes |
TRK(3) |
TRK(3) |
TRK(3) |
TRK(4) |
BLK(124) |
50 megabytes |
TRK(7) |
TRK(6) |
TRK(6) |
TRK(7) |
BLK(310) |
100 megabytes |
CYL(1) |
TRK(11) |
TRK(10) |
TRK(12) |
BLK(496) |
500 megabytes |
CYL(5) |
CYL(4) |
CYL(4) |
CYL(6) |
BLK(2232) |
1 gigabyte |
CYL(10) |
CYL(8) |
CYL(7) |
CYL(11) |
BLK(4480) |
2 gigabytes |
CYL(19) |
CYL(16) |
CYL(14) |
CYL(21) |
BLK(8866) |
4 gigabytes |
CYL(38) |
CYL(32) |
CYL(27) |
CYL(42) |
BLK(17696) |
5 gigabytes |
CYL(47) |
CYL(40) |
CYL(34) |
CYL(52) |
BLK(22080) |
10 gigabytes |
CYL(92) |
CYL(80) |
CYL(68) |
CYL(101) |
BLK(44144) |
50 gigabytes |
CYL(459) |
CYL(400) |
CYL(337) |
CYL(504) |
BLK(220286) |
Note: | The values in this table apply when the defaults are used for MAXPOOLs, MAXDBSPC, and MAXEXTNT. These parameters are described in Establishing Database Capacity Parameters. |
Use Table 3 to choose the initial directory size. Detailed information for generating its values is contained in Appendix B, Estimating Database Storage. When estimating the maximum database size, include the sizes of the public, private, and internal dbspaces.
The directory minidisk for the starter database supports about 4.9 gigabytes of data. This includes space for internal dbspace definitions so the actual space supported for public and private dbspaces is about 4.6 gigabytes.
The directory minimum size cannot extend beyond a single volume; therefore, the maximum database size is limited by the single volume capacity of the device type used. The absolute maximum size for a database is either 64 gigabytes or the limit imposed by the device type, whichever is smaller. For the limits imposed by various devices, see Table 41 and Table 42.
The directory minidisk will be used extensively by the database manager for resolution of data addresses. Thus, you should not allocate it to a volume that will contain either the log minidisks or heavily used data dbextents. Instead, place it on a separate volume to avoid device contention.
If DASD is limited on your system and the directory must share a volume with data dbextents, put it on a volume with a dbextent that contains infrequently referenced data. For example, sharing a volume with private dbspaces or historical data is preferable to sharing one with public dbspaces or current, highly active data.
The database manager requires at least one log minidisk and can support two. It is recommended that you use two logs.
The log minidisks contain information, recorded during database processing, that is used to support database recovery facilities. This includes control information (for example, COMMIT statement and checkpoint records) and the specifics of database changes (for example, inserts, updates, and deletes).
If you define two log minidisk they must be exactly the same size. Do not define them on different device types because it is almost impossible (because of rounding) to get identically sized data sets using space allocation algorithms.
The log history area, which is the final page of the log, is copied to the database machine's A-disk as the file ARIHSDS ARCHIVE immediately after a successful database or log archive. This A-disk file is used during a subsequent restore, if the log history area is unusable due to a log failure. The A-disk file is also copied to the file ARIHSDS PRECLDLG when a COLDLOG RECONFIGURE is done to ensure recoverability.
The size of the log data setis specified by the VM MDISK control statement, as shown in Figure 90. The size you specify will depend on the use of the database and on the type of recovery capabilities you want. If you underestimate this size at database generation time, you can redefine it afterwards, as described in Log Reconfiguration .
The log size depends on the number of changes that you expect will be made to the database and on whether or not you plan to use archiving facilities. If either database or log archiving is enabled, the log must be large enough to hold all the logging done between archives; otherwise it need only be large enough to hold the logging done in a few hours.
Note: | If you are putting dbspaces in nonrecoverable storage pools, keep in mind that only minimal logging is done for them, so the following log size considerations would not apply to those dbspaces. |
If you run the database manager without the archiving facilities (LOGMODE=Y or N), log space is reclaimed as applications finish and checkpoints of the database are taken. Usually, this occurs every few seconds or every few minutes. Many uses of the database manager can be supported by a log size of only one or two cylinders; however, a long-running application may require more log space.
Typically, the largest demand for log space is online loading or data reorganization. These processes run longer than most applications and cause a lot of logging to occur.
A starting estimate for the initial log size is twice the space requirements of your largest dbspace. If you have one exceptionally large dbspace, you can disregard it and use the size of the next largest dbspace. The data in the largest dbspace can be loaded and reorganized offline with logging inhibited.
If you are using the archiving facilities (LOGMODE=A or L), log space is not reclaimed until an archive is taken. That is, log space is not reused between archives of the log or database. Typically, you would only archive the database once or twice a week. You may choose to do log archiving more frequently, depending on database usage.
To estimate the size of the log, consider the amount of logging that will occur between archives. A useful approach is to estimate the percentage of data that will be generated, deleted, and changed over one archive period as follows:
logsize estimate = (percentage generated + percentage deleted + percentage changed x 2) x database size
For example, assume that in a one-week period the database size grows by 5% but also shrinks by 4%, and that 6% of the database (rows) are changed. Your estimate for the log size would be:
logsize estimate = .21 x database size
If your database size were 100 megabytes and you wanted an archive period of one week, your log size estimate would be:
logsize estimate = 21 megabytes
This is approximately 30 cylinders of an IBM 3390 DASD device.
The log requirements for processing the DBS utility DATALOAD and RELOAD commands in multiple user mode are:
The log space consumption caused by these operations can be avoided by running the DBS utility in single user mode with LOGMODE=N specified, or by using the COMMITCOUNT option to force periodic checkpoints in multiple user mode.
Like the directory minidisk the log minidisks are frequently referenced during processing. To avoid device contention, they should reside on separate volumes from the directory or heavily used dbextents.
If two log minidisks are defined, place them on separate volumes. If they were allocated to the same one, loss of that volume would cause the loss of both logs, thus defeating the purpose of dual logging.
The database machine A-disk should be on a volume separate from the log minidisks. If it is allocated to the same volume as either log, loss of that volume would result in loss of both copies of the log history area (that is, the one on the log itself and the one on the A-disk) thus defeating the purpose of having two copies of the history area.
The MAXPOOLS, MAXEXTNT, MAXDBSPC, and CUREXTNT keyword control statements can be specified as input to database generation. The SQLDBGEN EXEC calls the program ARISQLDS with STARTUP=C to process these control statements. The first three of these statements are optional. The last one must be specified.
The MAXPOOLS, MAXEXTNT, and MAXDBSPC values are fixed when the database is generated: once defined, they cannot be changed for its lifetime. To avoid future limitation problems, it is recommended that you set them to the allowed maximums. This will take about 1 cylinder of DASD on a 3380 device for the directory, and 280K virtual storage when the database manager is running.
The MAXPOOLS specification determines the maximum number of storage pools that can be defined in the database. Storage pools control the location of data on DASD volumes - that is, what dbspaces are located on what volumes. You can make a generous estimate for MAXPOOLS, since the value specified results in only a small directory space allocation for each potential storage pool. You should plan on having one storage pool for each user group (or billing account), and one for each major application you expect the database to support.
The MAXEXTNT controls the maximum number of dbextents that are defined to support the database being generated. Dbextents determine the physical allocation of DASD space for a storage pool.
Because a dbextent is a VM minidisk, it cannot span DASD volumes. This means that you need at least as many dbextents as volumes. You can, of course, define multiple dbextents on one volume. It also means that if you have a dbspace that spans multiple volumes, the corresponding storage pool requires multiple dbextents.
Because you should plan to support multiple dbextents for each storage pool and you should be prepared to extend most, if not all, of your planned storage pools, MAXEXTNT should be much larger than MAXPOOLS. Your estimate for it can be generous because this value results in only a small directory space allocation for each potential dbextent.
MAXDBSPC controls the maximum number of dbspaces, including internal dbspaces, that can be defined for the database. See Determining the Internal Dbspace Requirements. A dbspace is a logical allocation of database space for holding one or more tables and their indexes. A dbspace is assigned to a storage pool when it is defined and draws on the actual DASD space available in that storage pool on an as-needed basis. Typically, dbspaces are defined to support private space allocations for individual users and space allocations for specific applications; thus, the number of dbspaces required generally depends on the number of users and the number of tables needed for applications. Each user probably requires from one to five private dbspaces over the lifetime of the database, and each application requires, at most, one dbspace for each table being accessed. For performance reasons, one table per dbspace is recommended.
As with the previous two parameters, your estimate for MAXDBSPC can be generous, because the value you specify will result in only a small allocation of directory space for each potential dbspace.
CUREXTNT determines the number of dbextents defined during database generation. This number should be sufficient to support your initial storage requirements. You can add more dbextents after database generation.
Any public dbspace that has SYS as the first three characters in its name is reserved for system use only. The system dbspaces established at database generation time are PUBLIC.SYS0001, PUBLIC.SYS0002, PUBLIC.HELPTEXT, PUBLIC.ISQL, and PUBLIC.SAMPLE.
This section presents only general concepts related to setting the initial dbspace sizes. For more information, see Specifying Initial Dbspaces and Appendix B, Estimating Database Storage.
Note: | Physical space is not actually consumed until required, so you can afford to define the SYS0001 dbspace to be very large. Be generous: this dbspace cannot be dropped or recreated after the database is generated. If you make it too small and SYS0001 runs out of usable space, you will have to regenerate the database which can be a considerable task. |
The ARISDBU MACRO contains SQL statements to acquire the public dbspaces HELPTEXT, ISQL, and SAMPLE. If you want to increase their size, update the appropriate ACQUIRE DBSPACE statement in ARISDBU.
Except for PUBLIC.SAMPLE, the sizes that you establish for system dbspaces at database generation time can limit the logical capacity of your database. Because physical space is not actually used until required, you should establish large sizes for them. The large recommended sizes shown in Figure 2 will support most uses of the database manager.
Figure 2. Guidelines for the Sizes of the System Dbspaces
When you generate the database, you need only consider the dbspace requirements for its initial use. To determine the initial user dbspace requirements, either consult with the database administrator or refer to the DB2 Server for VSE & VM Database Administration manual. The SQLADBSP EXEC can be used to add more later, up to the MAXDBSPC value.
For more information, refer to Chapter 7, Managing Database Storage.
The database manager uses internal dbspaces to process commands that require sort operations and to process views that require materialization. For information on sorting and materialization, see the DB2 Server for VSE & VM Database Administration manual.
The internal dbspaces are held until a COMMIT or ROLLBACK statement is issued; therefore, a single application may hold a number of internal dbspaces at one time. For example, if each SELECT needs an average of two internal dbspaces, and a certain program issues five SELECTs before issuing a COMMIT statement, then that program will hold 10 internal dbspaces. Internal dbspaces that are not in use take up minimal space (approximately 4 bytes of directory space for each page).
Allocate at least 30 internal dbspaces; more if your installation has interactive users. The exact number required depends on the number of logical units of work (LUWs) that are concurrently active and the amount of sorting and view materialization required in those LUWs. Because the number of NCUSERS is comparable to the number of concurrently active LUWs, as a guideline, in addition to the minimum of 30, you may want to provide 10 internal dbspaces for each NCUSER (see the description of the NCUSERS parameter on NCUSERS). After the database has been generated, you can always add more internal dbspaces by using the SQLADBSP EXEC. All internal dbspaces (and their storage pool assignments) are redefined on each run of this EXEC.
The physical placement of the internal dbspaces affects performance, especially when you perform a sort operation on a large table. You should place internal dbspaces in their own storage pool, and use multiple dbextents over multiple devices. There are several ways of doing this. Suppose you had 300 3380-type cylinders for internal dbspace dbextents, you could use one of these strategies:
The purpose of all these strategies is to spread input/output activity over more devices as the size of a sort increases. The strategy you adopt determines how many dbextents a sort requires. With the first strategy, a sort requiring 60 cylinders uses two dbextents. With the second and third strategies, the same sort requires three dbextents. Use a strategy that is suitable for your organization.
Sorting is done for ORDER BY, GROUP BY, join, CREATE INDEX, or UNION operations. The internal dbspaces must be large enough to hold the rows being sorted. For example, if an ORDER BY operation is requested using all the columns of an entire table, the internal dbspace must be large enough to hold the whole table. Less space is required if all the columns are not selected. During index creation, space is required only for the key columns. To calculate the required size of an internal dbspace, use the formula (KEYSIZE + 8 bytes) * ROWCOUNT. Make the internal dbspaces large enough to hold the largest table or query result you want to be able to sort. The dbspace size estimates are discussed under Appendix B, Estimating Database Storage.
The number of internal dbspaces required also depends on the planned usage of the system. Fewer are needed for preplanned application processing than for dynamic query processing, as query users usually hold dbspaces longer than do preplanned applications.
Internal dbspaces can also be stored on a virtual disk. Only use virtual disks for internal dbspaces because information on a virtual disk is lost when the database is restarted. For more information on virtual disk support, see the DB2 Server for VSE & VM Performance Tuning Handbook manual.
Sufficient space must be allocated during database generation to support your initial dbspace data storage requirements. You must define at least one dbextent for each storage pool that initially contains dbspaces. The specific amount to allocate for each storage pool depends on the following considerations:
System dbspaces are heavily used, so they should not share their storage pool (storage pool 1) with heavily used user dbspaces. Until you gain experience with your data, do not put user dbspaces in the same storage pool as system dbspaces.
You should undercommit storage pool support for the SYS0001 and SYS0002 dbspaces. If the catalog tables grow significantly, you can later allocate an additional dbextent, probably on a separate device, to avoid excessive device contention on catalog access.
Storage pool support for PUBLIC.HELPTEXT should be large enough to hold the HELP tables; PUBLIC.ISQL must be large enough to hold your initial needs for stored queries; and PUBLIC.SAMPLE should be large enough to hold the number of sample data tables needed.
Dbspaces for use primarily by end users should be supported by one or more storage pools. Public and private dbspaces can share a storage pool; however, you may want to manage space allocation differently for these two cases.
A recommended approach to storage pool support for end user data is to define more dbextent space than is needed to support your initial dbspace definitions. This approach is called overcommitting, and ensures that end user space requirements can be accommodated as existing users need more space or more users are added to the system.
If your installation plans to bill users for DASD storage space, you may want to consider separate storage pools for different user groups (or account numbers).
Note: | You can also use statistics from the SYSTEM.SYSDBSPACES catalog table to achieve this. |
Storage pool support of dbspaces for use primarily by application programs varies, depending on the nature of the data and the storage management technique. In general, consider using different storage pools for different applications, and undercommitting storage pool support for application dbspaces.
The dbspaces for applications should be defined to be larger than is believed necessary, to avoid later reorganization because of data growth. If you do this, storage pool requirements are smaller than the dbspace sizes indicate. The initial storage pool allocations should be large enough to cover initial loading of the data plus growth over the next planning period (for example, six months or a year).
Storage pool support for internal dbspaces should be undercommitted, since you probably do not need storage to support all internal dbspaces at the maximum size. As a rough estimate, the storage pool for internal dbspaces should have enough DASD space available to hold data for three internal dbspaces (at the internal dbspace size specified at database generation).
Storage space for internal dbspaces is taken from the storage pool assigned at database generation time. In general, this storage pool should not be used for system dbspaces or other heavily used dbspaces. Consider using a separate storage pool just for internal dbspaces.
For more information on storage organization techniques, see Chapter 7, Managing Database Storage.