DB2 for OS/390

Concepts

Buffer pool size

Buffer pools

Cast function

CASTOUT

Comment_On statement

Databases

Data sets

Integrated catalog facility

Locking

Moving data

Naming conventions

Schema

SQL operators

Storage Group

Table space

Threads

Buffer pool size guidelines

DB2 handles large virtual buffer pools very efficiently. Searching in large virtual buffer pools (100MB or more) does not use any more of the processor's resources than searching in smaller pools.

For processors dedicated to DB2, start with the default buffer pool sizes. You can increase the buffer pool size as long as the number of I/Os continues to decrease, or until paging becomes a problem. If your application uses random I/Os to access the data, the number of I/Os might not decrease significantly unless the buffer pool is larger than the table, and other applications require little concurrent buffer pool usage.

Problems with Paging: When the buffer pool size requirements are excessive (real storage plus expanded storage), the oldest buffer pool pages migrate to auxiliary paging storage. Subsequent access to these pages results in a page fault. I/O must bring the data into real storage. Paging of buffer pool storage impacts DB2 performance.

Buffer pools

Buffer pools are areas of virtual storage that temporarily store pages of table spaces or indexes. When an application program accesses a row of a table, DB2 places the page containing that row in a buffer. If the requested data is already in a buffer, the application program does not have to wait for it to be retrieved from DASD. Avoiding the need to retrieve data from DASD results in faster performance.

If the row is changed, the data in the buffer must be written back to DASD eventually. But that write operation might be delayed until DB2 takes a checkpoint, or until one of the related write thresholds is reached.

The data remains in the buffer until DB2 decides to use the space for another page. Until that time, the data can be read or changed without a DASD I/O operation.

DB2 allows you to use up to 50 buffer pools that contain 4 KB buffers and 10 buffer pools that contain 32 KB buffers. You can set the size of each of those buffer pools separately when installing DB2. You can change the sizes and other characteristics of a buffer pool at any time while DB2 is running, by selecting Alter from the pop-up menu of a buffer pool.

Cast function

A cast function is a function that coverts (casts) from a distinct type (user-defined type) to its source type and back again. When a distinct type is created, DB2 generates the following cast functions:

A function to convert from the distinct type to its source data type.
A function to convert from the source data type to the distinct type.
A function to convert from a data type A to distinct type DT, where A is promotable to the source data type S of distinct type DT.

For some source data types, DB2 supports an additional function to convert from:

INTEGER to the distinct type if the source type is SMALLINT.
VARCHAR to the distinct type if the source type is CHAR.
VARGRAPHIC to the distinct type if the source type is GRAPHIC.
DOUBLE to the distinct type if the source type is REAL.

The cast function are created as if the following statements made:

	
CREATE FUNCTION source-type-name (distinct-type-name)
	RETURNS source-type-name ...
CREATE FUNCTION distinct-type-name (source-type-name)
	RETURNS distinct-type-name ...

Even if you specify a length, precision, or scale for the source data type when you create the distinct type, the name of the cast function that converts from the distinct type to the source type is simply the name of the source data type. The data type of the value that the cast function returns includes any length, precision, or scale values that you specified for the source data type.

The name of the cast function that converts from the source type to the distinct type is the name of the distinct type. The input parameter of the cast function has the same data type as the source data type, including the length, precision, and scale.

For example, a distinct type named T_SHOESIZE with schema CLAIRE is created as a VARCHAR with a length of 2. DB2 generates the following cast functions:

FUNCTION CLAIRE.VARCHAR (CLAIRE.T_SHOESIZE) RETURNS SYSIBM.VARCHAR (2)
FUNCTION CLAIRE.T_SHOESIZE (SYSIBM.VARCHAR (2)) RETURNS CLAIRE.T_SHOESIZE

Notice that function VARCHAR returns with a data type of VARCHAR(2) and that function T_SHOESIZE has an input parameter with a data type of VARCHAR(2).

You cannot explicitly drop a generated cast function. The cast functions that are generated for a distinct type are implicitly dropped when the distinct type is dropped.

CASTOUT

This is an option from the ALTER BUFFERPOOL window. The value can be either YES or NO.

When CASTOUT is YES, MVS discards the data cached in the hiperpool when a shortage of expanded storage arises. When this data is discarded, hiperspace backing expanded storage pages is released.

Selecting NO tells MVS to assign a high priority to keeping the data cached in the hiperpool. This can severely limit the availability of expanded storage to other processes on the system. So, you should select NO only for buffer pools associated with databases for which response time is critical.

COMMENT_ON statement

Statement COMMENT ON adds or replaces comments in the descriptions of various objects in the DB2 catalog at the current server.

These objects can have comments:

aliases
columns
distinct types
stored procedures
tables
triggers
user-defined functions
views

Example:

COMMENT ON TABLE dsn8610.emp IS 'reflects 1st qtr 81 reorg';

Databases

When you define a DB2 database, you give a name to an eventual collection of tables and associated indexes, as well as to the table spaces in which they reside.

One database can contain all the data associated with one application or with a group of related applications. By collecting the data into one database, you can start or stop access to all the data in one operation and grant authorization for access to all the data as a single unit.

The default database, DSNDB04, is predefined in the DB2 installation process. This database has a default buffer pool (BP0), and a default DB2 storage group (SYSDEFLT).

Data sets

All DB2 for OS/390 data is stored in VSAM data sets. You can let DB2 create and manage these data sets (using DB2 storage groups) or you can do it yourself (using VSAM access method services).

Data-set names always have multiple parts that are separated by a period. Each name part represents a level of qualification. For example, the data set name DBUSER.USER.SALES.ACCOUNTS has four parts. The first part on the left is called the high-level qualifier; the last is the low-level qualifier. (The high-level qualifier is not used in DB2 commands.) Each part can be up to 8 characters long, and the entire name can be up to 44 characters long including the separating periods. Each part must start with an alphabetic (A to Z) or national (# @ $) character. The remaining 7 characters are either alphabetic, numeric (0-9), national, or a hyphen (-).

Data set names are generally in the format catname.DSNDBx.dbname.psname.I0001.Annn
where:
catname = Integrated catalog name or alias (up to eight characters).
DSNDBx = DSNDBC (for VSAM clusters) or DSNDBD (for VSAM data components).
dbname = DB2 database name.
psname = Table space name or index name. This name must be unique within the database.
Annn = Data set number. For partitioned table spaces, the number is 001 for the first partition, 002 for the second, and so forth, up to the maximum of 254 partitions. For a nonpartitioning index on a partitioned table space that you define using the LARGE option, the maximum data set number is 128. For simple or segmented table spaces, the number is 001 for the first data set.

Here are some examples of names of a DB2 table space and index space data set on application data volume ABCD01:

DSN8D51P.DSN8S51Q.I0001.A001
DSN8D51P.XMAPRTBL.I0001.A001
...
The above data set names are shown in a short form without the high-level qualifier. The full data-set name is
ABCDB2.DSN8D51P.DSN8S51Q.I0001.A001.

The data sets associated with a DB2 subsystem all have the same, predefined high-level qualifier. The high-level qualifier in our example above is ABC followed by the DB2 subsystem ID. For example, table space data sets and other data sets used by DB2 subsystem DB24 are named:

ABCDB24.xxx.yyy.zzz

where xxx.yyy.zzz is the rest of the data set name which may consist of a different number of parts depending on the data set type.

Data set sizes are expressed as a number of 512-byte blocks. If the physical block size is different than 512 bytes, the numbers relate to the effective capacity. When you create a data set the primary space you specify is allocated on a disk volume. If this space is filled, the system allocates additional space according to the secondary space you specify. Up to 122 secondary extents are allocated if needed.

Integrated catalog facility

For both user-managed and DB2-managed data sets, you need at least one integrated catalog facility catalog, either user or master, created with the integrated catalog facility. You must identify the integrated catalog facility catalog (the "integrated catalog") when you create a storage group or when you create a table space that does not use storage groups.

If your integrated catalog facility catalog is password protected, you must supply a password when you create a storage group. That is the control or master password of the integrated catalog facility catalog that lists all dynamically created data sets in the storage group.

The integrated catalog facility passwords are used by DB2 in access method services DEFINE, ALTER, and DELETE commands. They are stored in the SYSIBM.SYSSTOGROUP table of the DB2 subsystem catalog.

If you change the integrated catalog password known to DB2 in redefining the storage group, you must execute access method services to change that password in the integrated catalog.

Locking

The hierarchy of lock sizes starting with the largest is:

table space lock
table lock (only for segmented table spaces)
page lock and row lock

In all other cases, LOCKSIZE has no effect on an SQL statement until the statement is rebound.

The lock sizes are:

ANY: Specifies the DB2 can use any lock size. Generally, DB2 will choose PAGE locks. When the number of locks acquired for the table space is greater than the maximum number of locks allowed for a table space (an installation parameter), the page locks are released and locking is set at the next higher level. If the table space is not a segmented table space, the next higher level is the table space.
TABLESPACE: The table space locks.
TABLE: The table locks. This is only applicable for a segmented table space.
PAGE: The page locks.
ROW: The row locks

Moving data

The movement of DB2 data is accomplished by the utilities RECOVER, REORG, or DSN1COPY, or by the use of non-DB2 facilities, such as DFSMSdss. Both the DB2 utilities and the non-DB2 tools can be used while DB2 is running, but the space to be moved should be stopped to prevent users from accessing it.

Naming conventions

The rules for forming a name depend on the type of the object designated by the name. In the following list, short identifier implies a maximum of 8 characters; long identifier implies a larger maximum.

alias name: A qualified or unqualified name that designates an alias, table, or view
authorization name: A short identifier that designates a set of privileges.
aux table name: A qualified or unqualified name that designates an auxiliary table. See table name for specific rules.
bpname: A name that identifies a buffer pool; the names depend on the size of the buffer pool.
catalog name: A short identifier that designates an integrated catalog facility catalog.
collection id: A long identifier that identifies a collection of packages.
column name: A qualified or unqualified name that designates a column of a table or view; unqualified form is a long identifier; qualified form is a qualifier followed by a period and a long identifier.
constraint name: A short identifier that designates a referential constraint on a table; a long identifier that designates a check constraint on a table.
correlation name: A long identifier that designates a table, a view, or individual rows of a table or view.
cursor name: A long identifier that designates an SQL cursor.
database name: A short identifier that designates a database; the identifier must start with a letter and must not include special characters.
storage group name: An unqualified identifier of up to eight characters. A DB2 storage group name must not be the same as the name of any other storage group in the DB2 catalog.

Schema

A schema is a logical grouping for user-defined functions, distinct types, triggers, and stored procedures. When an object of one of these types is created, it is assigned to one schema which is determined by the name of the object. For example, the following statement creates a distinct type T in schema C:

CREATE DISTINCT TYPE C.T ...

SQL operators

Use equal (=) to select rows for which a specified column contains a specified value.

COL = value

Use the IN predicate to select each row that has a column value equal to one of several listed values.

COL IN ('B01', 'C01', 'D01')

Use LIKE to specify a character string that is similar to the column value of rows you want to select:

Use a percent sign (%) to indicate any string of zero or more characters.
- COL LIKE '%char'
Use an underscore (_) to indicate any single character.
- COL LIKE '_char'

Use BETWEEN to select rows in which a column has a value within two limits.

COL BETWEEN value1 AND value2

Use IS NULL to retrieve rows that contain a null value in some column. A null value indicates the absence of a column value in a row. A null value is not the same as zero or all blanks.

COL IS NULL

Use the NOT keyword to select all rows except the rows identified with the search condition.

Not equal (<>)
- COL <> value
not in
- COL NOT IN (list)
not like
- COL NOT LIKE ' char'
not between
- COL NOT BETWEEN value1 AND value2
is not null
- COL IS NOT NULL
You can also use the symbols:
- <=
- <
- >
- >=

Storage Groups

A DB2 storage group is a set of volumes on direct access storage devices (DASD). The volumes hold the data sets in which tables and indexes are actually stored. The description of a storage group names the group and identifies its volumes and the VSAM catalog that records the data sets.

The unqualified identifier for a storage group can be up to eight characters. Storage group names must not be the same as any other storage group in the DB2 catalog. The DB2 catalog table SYSIBM.SYSSTOGROUP contains a row for each storage group that you define, and the table SYSIBM.SYSVOLUMES contains a row for each DASD volume. At installation, the system default storage group is defined. This storage group is named SYSDEFLT. If you do not explicitly manage your storage, then DB2 uses the default storage group to allocate space.

All volumes of a given storage group must have the same device type. But, parts of a single database can be stored in different storage groups. If the volumes in a storage group are different types or if any volume is not mounted or is otherwise invalid, then an error will occur when you try to create a table space or index. Try to assign frequently accessed objects (indexes, for instance) to fast devices and seldom-used tables to slower devices; that choice of storage groups improves performance.

After you define a storage group, DB2 stores information about it in the DB2 catalog. (This catalog is not the same as the integrated catalog facility catalog that describes DB2 VSAM data sets). The catalog table SYSIBM.SYSSTOGROUP has a row for each storage group and SYSIBM.SYSVOLUMES has a row for each volume. With the proper authorization, you can display the catalog information about DB2 storage groups by using SQL statements.

Use storage groups whenever you can, either specifically or by default. However, if you want to maintain closer control over the physical storage of your tables and indexes, you can define and manage your own VSAM data sets using VSAM access method services. Yet another possibility is to have SMS manage some or all of your DB2 data sets.

Table spaces

A table space is one or more VSAM data sets in which one or more tables are stored. Data sets that are managed by DB2 (contained in storage groups) are linear data sets (LDSs). Table spaces are divided into equal-sized units, called pages, which are written to or read from DASD in one operation.

You can optionally describe a table space as partitioned, segmented, or simple.

A partitioned table space has the available space divided into separate units of storage, called partitions, with each partition containing part of one table. Partitions can be independently assigned to separate storage groups, but the entire collection of data is logically a single table.

You should partition your table space:

to improve data availability
to improve utility performance
to improve query response time
to improve efficiency in table space and index scans

A segmented table space is intended to hold more than one table. The available space is divided into groups of pages called segments. Each segment is the same size and contains rows from only one table.

A simple table space is a table space that is neither partitioned nor segmented. A simple table space can contain more than one table, but the rows of different tables are not kept separate.

Threads

A thread is a DB2 structure that describes an application's connection, traces its progress, processes resource functions, and delimits its accessibility to DB2 resources and services. Most DB2 functions execute under a thread structure.

Threads are an important DB2 resource. When you install DB2, you choose a maximum number of active allied and database access threads that can be allocated concurrently. Choosing a good number for this is important to keep applications from queuing and to provide good response time.

When writing an application, you should know when threads are created and terminated and when they can be reused, because thread allocation can be a significant part of the cost in a short transaction.

A DB2 thread is either an allied thread, a database access thread, or a parallel task thread. Threads can be active, inactive, indoubt, or for Version 6 and later, postponed.

Distributed threads are those threads that have a connection with a remote location (active or inactive) or that had a connection with a remote location (indoubt). An allied thread and a parallel task thread can be distributed or nondistributed; a database access thread is always distributed.

An allied thread is a thread originating at the local DB2 subsystem that may access data at a remote DB2 subsystem. A database access thread is a thread accessing data at the local subsystem on behalf of a remote subsystem.

A logical unit of work identifier (LUWID) is a name that uniquely identifies a thread within a network. This name consists of a fully-qualified LU network name, an LUW instance number, and an LUW sequence number.

Commit processing can occur many times while a thread is active. When the application program or the thread terminates, an implicit COMMIT or SYNCPOINT is issued. When the thread is terminated, an accounting record is written. This record does not report transaction activity that takes place before the thread is created. If RELEASE(DEALLOCATE) is used, this is when table space locks are released, the DBD use count is decreased, and the thread storage is released.