Administration Guide

Creating and Populating a Table

After you determine how to organize your data into tables, the next step is to create those tables, by using the CREATE TABLE statement. The table descriptions are stored in the system catalog of the database to which you are connected.

The syntax of the CREATE TABLE statement is described in detail in the SQL Reference. For information on creating a summary table, see Creating a Summary Table. For information about naming tables, columns, and other database objects, see Appendix A, Naming Rules.

The CREATE TABLE statement gives the table a name, which is a qualified or unqualified identifier, and a definition for each of its columns. You can store each table in a separate table space, so that a table space contains only one table. If a table will be dropped and created often, it is more efficient to store it in a separate table space and then drop the table space instead of the table. You can also store many tables within a single table space. In a partitioned database environment, the table space chosen also defines the nodegroup and the database partitions on which table data is stored.

The table does not contain any data at first. To add rows of data to it, use one of the following:

Details concerning the movement of data into and out of tables is presented in Data Movement Utilities Guide and Reference.

Adding data to a table can be done without logging the change. The NOT LOGGED INITIALLY clause on the CREATE TABLE statement prevents logging the change to the table. Any changes made to the table by an INSERT, DELETE, UPDATE, CREATE INDEX, DROP INDEX, or ALTER TABLE operation in the same unit of work in which the table is created are not logged. Logging begins in subsequent units of work.

A table consists of one or more column definitions. A maximum of 500 columns can be defined for a table. Columns represent the attributes of an entity. The values in any column are all the same type of information. Refer to the SQL Reference for more information.

Note:The maximum of 500 columns is true when using a 4 KB page size. The maximum is 1012 columns when using an 8 KB, 16 KB, or 32 KB page size.

A column definition includes a column name, data type, and any necessary null attribute, or default value (optionally chosen by the user).

The column name describes the information contained in the column and should be something that will be easily recognizable. It must be unique within the table; however, the same name can be used in other tables. See Object Names for information about naming rules.

The data type of a column indicates the length of the values in it and the kind of data that is valid for it. The database manager uses character string, numeric, date, time and large object data types. Graphic string data types are only available for database environments using multi-byte character sets. In addition, columns can be defined with user-defined distinct types, which are discussed in Creating a User-Defined Type (UDT).

The default attribute specification indicates what value is to be used if no value is provided. The default value can be specified, or a system-defined default value used. Default values may be specified for columns with, and without, the null attribute specification.

The null attribute specification indicates whether or not a column can contain null values.

To create a table using the Control Center:
  1. Expand the object tree until you see the Tables folder.
  2. Right-click the Tables folder, and select Create --> Tables Using Wizard from the pop-up menu.
  3. Follow the steps in the wizard to complete your tasks.

To create a table using the command line, enter:

   CREATE TABLE <NAME>
      (<column_name>  <data_type>  <null_attribute>)
      IN <TABLE_SPACE_NAME)

The following is an example of a CREATE TABLE statement that creates the EMPLOYEE table in the RESOURCE table space. This table is defined in the sample database:

   CREATE TABLE EMPLOYEE
      (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
       FIRSTNME  VARCHAR(12) NOT NULL,
       MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
       LASTNAME  VARCHAR(15) NOT NULL,
       WORKDEPT  CHAR(3),
       PHONENO   CHAR(4),
       PHOTO     BLOB(10M)   NOT NULL)
   IN RESOURCE

When creating a table, you can choose to have the columns of the table based on the attributes of a structured type. Such a table is called a "typed table".

A typed table can be defined to inherit some of its columns from another typed table. Such a table is called a "subtable", and the table from which it inherits is called its "supertable". The combination of a typed table and all its subtables is called a "table hierarchy". The topmost table in the table hierarchy (the one with no supertable) is called the "root table" of the hierarchy.

The following sections build on the previous example to cover other options you should consider:

You can also create a table that is defined based on the result of a query. This type of table is called a summary table. For more information, see Creating a Summary Table.

Large Object (LOB) Column Considerations

Before creating a table that contains large object columns, you need to make the following decisions:

  1. Do you want to log changes to LOB columns?

    If you do not want to log these changes, you must turn logging off by specifying the NOT LOGGED clause when you create the table:

       CREATE TABLE EMPLOYEE
          (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
           FIRSTNME  VARCHAR(12) NOT NULL,
           MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
           LASTNAME  VARCHAR(15) NOT NULL,
           WORKDEPT  CHAR(3),
           PHONENO   CHAR(4),
           PHOTO     BLOB(10M)   NOT NULL  NOT LOGGED)
       IN RESOURCE
    

    If the LOB column is larger than 1 GB, logging must be turned off. (As a rule of thumb, you may not want to log LOB columns larger than 10 MB.) As with other options specified on a column definition, the only way to change the logging option is to re-create the table.

    Even if you choose not to log changes, LOB columns are shadowed to allow changes to be rolled back, whether the roll back is the result of a system generated error, or an application request. Shadowing is a recovery technique where current storage page contents are never overwritten. That is, old, unmodified pages are kept as "shadow" copies. These copies are discarded when they are no longer needed to support a transaction rollback.
    Note:When recovering a database using the RESTORE and ROLLFORWARD commands, LOB data that was "NOT LOGGED"and was written since the last backup will be replaced by binary zeros.

  2. Do you want to minimize the space required for the LOB column?

    You can make the LOB column as small as possible using the COMPACT clause on the CREATE TABLE statement. For example:

       CREATE TABLE EMPLOYEE
          (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
           FIRSTNME  VARCHAR(12) NOT NULL,
           MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
           LASTNAME  VARCHAR(15) NOT NULL,
           WORKDEPT  CHAR(3),
           PHONENO   CHAR(4),
           PHOTO     BLOB(10M)   NOT NULL  NOT LOGGED  COMPACT)
       IN RESOURCE
    

    There is a performance cost when appending to a table with a compact LOB column, particularly if the size of LOB values are increased (because of storage adjustments that must be made).

    On platforms such as OS/2 where sparse file allocation is not supported and where LOBs are placed in SMS table spaces, consider using the COMPACT clause. Sparse file allocation has to do with how physical disk space is used by an operating system. An operating system that supports sparse file allocation does not use as much physical disk space to store LOBs as compared to an operating system not supporting sparse file allocation. The COMPACT option allows for even greater physical disk space "savings" regardless of the support of sparse file allocation. Because you can get some physical disk space savings when using COMPACT, you should consider using COMPACT if your operating system does not support sparse file allocation.
    Note:DB2 system catalogs use LOB columns and may take up more space than in previous versions.

  3. Do you want better performance for LOB columns, including those LOB columns in the DB2 system catalogs?

    There are large object (LOB) columns in the catalog tables. LOB data is not kept in the buffer pool with other data but is read from disk each time it is needed. Reading from disk slows down the performance of DB2 where the LOB columns of the catalogs are involved. Since a file system usually has its own place for storing (or caching) data, using a SMS table space, or a DMS table space built on file containers, make avoidance of I/O possible when the LOB has previously been referenced.

Defining Constraints

This section discusses how to define constraints:

For more information on constraints, see Planning for Constraints Enforcement and refer to the SQL Reference.

Defining a Unique Constraint

Unique constraints ensure that every value in the specified key is unique. A table can have any number of unique constraints, with at most one unique constraint defined as a primary key.

You define a unique constraint with the UNIQUE clause in the CREATE TABLE or ALTER TABLE statements. The unique key can consist of more than one column. More than one unique constraint is allowed on a table. However, a unique constraint may not be defined on a subtable.

Once established, the unique constraint is enforced automatically by the database manager when an INSERT or UPDATE statement modifies the data in the table. The unique constraint is enforced through a unique index.

When a unique constraint is defined in an ALTER TABLE statement and an index exists on the same set of columns of that unique key, that index becomes the unique index and is used by the constraint.

You can take any one unique constraint and use it as the primary key. The primary key can be used as the parent key in a referential constraint (along with other unique constraints). There can be only one primary key per table. You define a primary key with the PRIMARY KEY clause in the CREATE TABLE or ALTER TABLE statement. The primary key can consist of more than one column.

A primary index forces the value of the primary key to be unique. When a table is created with a primary key, the database manager creates a primary index on that key.

Some performance tips for indexes used as unique constraints include:

Defining Referential Constraints

Referential integrity is imposed by adding referential constraints to table and column definitions. Referential constraints are established with the FOREIGN KEY clause, and the REFERENCES clause in the CREATE TABLE or ALTER TABLE statements. Refer to the SQL Reference for more information on the effects of a referential constraint on typed tables or to a parent table that is a typed table.

The identification of foreign keys enforces constraints on the values within the rows of a table or between the rows of two tables. The database manager checks the constraints specified in a table definition and maintains the relationships accordingly. The goal is to maintain integrity whenever one database object references another.

For example, primary and foreign keys each have a department number column. For the EMPLOYEE table, the column name is WORKDEPT, and for the DEPARTMENT table, the name is DEPTNO. The relationship between these two tables is defined by the following constraints:

The SQL statement defining the parent table, DEPARTMENT, is:

   CREATE TABLE DEPARTMENT
      (DEPTNO    CHAR(3)     NOT NULL,
       DEPTNAME  VARCHAR(29) NOT NULL,
       MGRNO     CHAR(6),
       ADMRDEPT  CHAR(3)     NOT NULL,
       LOCATION  CHAR(16),
          PRIMARY KEY (DEPTNO))
   IN RESOURCE

The SQL statement defining the dependent table, EMPLOYEE, is:

   CREATE TABLE EMPLOYEE
      (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
       FIRSTNME  VARCHAR(12) NOT NULL,
       LASTNAME  VARCHAR(15) NOT NULL,
       WORKDEPT  CHAR(3),
       PHONENO   CHAR(4),
       PHOTO     BLOB(10m)   NOT NULL,
          FOREIGN KEY DEPT (WORKDEPT)
          REFERENCES DEPARTMENT ON DELETE NO ACTION)
   IN RESOURCE

By specifying the DEPTNO column as the primary key of the DEPARTMENT table and WORKDEPT as the foreign key of the EMPLOYEE table, you are defining a referential constraint on the WORKDEPT values. This constraint enforces referential integrity between the values of the two tables. In this case, any employees that are added to the EMPLOYEE table must have a department number that can be found in the DEPARTMENT table.

The delete rule for the referential constraint in the employee table is NO ACTION, which means that a department cannot be deleted from the DEPARTMENT table if there are any employees in that department.

Although the previous examples use the CREATE TABLE statement to add a referential constraint, the ALTER TABLE statement can also be used. See Modifying a Table in Both Structure and Content.

Another example: The same table definitions are used as those in the previous example. Also, the DEPARTMENT table is created before the EMPLOYEE table. Each department has a manager, and that manager is listed in the EMPLOYEE table. MGRNO of the DEPARTMENT table is actually a foreign key of the EMPLOYEE table. Because of this referential cycle, this constraint poses a slight problem. You could add a foreign key later (see Adding Primary and Foreign Keys). You could also use the CREATE SCHEMA statement to create both the EMPLOYEE and DEPARTMENT tables at the same time (see the example in the SQL Reference).

FOREIGN KEY Clause

A foreign key references a primary key or a unique key in the same or another table. A foreign key assignment indicates that referential integrity is to be maintained according to the specified referential constraints. You define a foreign key with the FOREIGN KEY clause in the CREATE TABLE or ALTER TABLE statement.

The number of columns in the foreign key must be equal to the number of columns in the corresponding primary or unique constraint (called a parent key) of the parent table. In addition, corresponding parts of the key column definitions must have the same data types and lengths. The foreign key can be assigned a constraint name. If you do not assign a name, one is automatically assigned. For ease of use, it is recommended that you assign a constraint name and do not use the system-generated name.

The value of a composite foreign key matches the value of a parent key if the value of each column of the foreign key is equal to the value of the corresponding column of the parent key. A foreign key containing null values cannot match the values of a parent key, since a parent key by definition can have no null values. However, a null foreign key value is always valid, regardless of the value of any of its non-null parts.

The following rules apply to foreign key definitions:

REFERENCES Clause

The REFERENCES clause identifies the parent table in a relationship, and defines the necessary constraints. You can include it in a column definition or as a separate clause accompanying the FOREIGN KEY clause, in either the CREATE TABLE or ALTER TABLE statements.

If you specify the REFERENCES clause as a column constraint, an implicit column list is composed of the column name or names that are listed. Remember that multiple columns can have separate REFERENCES clauses, and that a single column can have more than one.

Included in the REFERENCES clause is the delete rule. In our example, the ON DELETE NO ACTION rule is used, which states that no department can be deleted if there are employees assigned to it. Other delete rules include ON DELETE CASCADE, ON DELETE SET NULL, and ON DELETE RESTRICT. See DELETE Rules.

Implications for Utility Operations

The LOAD utility will turn off constraint checking for self-referencing and dependent tables, placing these tables into check pending state. After the LOAD utility has completed, you will need to turn on the constraint checking for all tables for which it was turned off. For example, if the DEPARTMENT and EMPLOYEE tables are the only tables that have been placed in check pending state, you can execute the following command:

   SET INTEGRITY FOR DEPARTMENT, EMPLOYEE IMMEDIATE CHECKED

The IMPORT utility is affected by referential constraints in the following ways:

Defining a Table Check Constraint

A table check constraint specifies a search condition that is enforced for each row of the table on which the table check constraint is defined. You create a table check constraint on a table by associating a check-constraint definition with the table when the table is created or altered. This constraint is automatically activated when an INSERT or UPDATE statement modifies the data in the table. A table check constraint has no effect on a DELETE or SELECT statement. A check constraint can be associated with a typed table.

A constraint name cannot be the same as any other constraint specified within the same CREATE TABLE statement. If you do not specify a constraint name, the system generates an 18-character unique identifier for the constraint.

A table check constraint is used to enforce data integrity rules not covered by key uniqueness or a referential integrity constraint. In some cases, a table check constraint can be used to implement domain checking. The following constraint issued on the CREATE TABLE statement ensures that the start date for every activity is not after the end date for the same activity:

   CREATE TABLE EMP_ACT
      (EMPNO      CHAR(6)      NOT NULL,
       PROJNO     CHAR(6)      NOT NULL,
       ACTNO      SMALLINT     NOT NULL,
       EMPTIME    DECIMAL(5,2),
       EMSTDATE   DATE,
       EMENDATE   DATE,
       CONSTRAINT ACTDATES CHECK(EMSTDATE <= EMENDATE) )
   IN RESOURCE

Although the previous example uses the CREATE TABLE statement to add a table check constraint, the ALTER TABLE statement can also be used. See Modifying a Table in Both Structure and Content.

Defining a Generated Column on a New Table

A generated column is defined in a base table where the stored value is computed using an expression, rather than being specified through an insert or update operation. When creating a table where it is known that certain expressions or predicates will be used all the time, you can add one or more generated columns to that table. By using a generated column there is opportunity for performance improvements when querying the table data.

For example, there are two ways in which the evaluation of expressions can be costly when performance is important:

  1. The evaluation of the expression must be done many times during a query.
  2. The computation is complex.

To improve the performance of the query, you can define an additional column that would contain the results of the expression. Then, when issuing a query that includes the same expression, the generated column can be used directly; or, the query rewrite component of the optimizer can replace the expression with the generated column.

It is also possible to create a non-unique index on a generated column.

Where queries involve the joining of data from two or more tables, the addition of a generated column can allow the optimizer a choice of possibly better join strategies.

The following is an example of defining a generated column on the CREATE TABLE statement:

   CREATE TABLE t1 (c1 INT,
                    c2 DOUBLE,
                    c3 DOUBLE GENERATED ALWAYS AS (c1 + c2)
                    c4 GENERATED ALWAYS AS
                       (CASE WHEN c1 > c2 THEN 1 ELSE NULL END))

After creating this table, indexes can be created using the generated columns. For example,

   CREATE INDEX i1 ON t1(c4)

Queries can take advantage of the generated columns. For example,

   SELECT COUNT(*) FROM t1 WHERE c1 > c2

can be written as

   SELECT COUNT(*) FROM t1 WHERE c4 IS NOT NULL

Another example:

   SELECT c1 + c2 FROM t1 WHERE (c1 + c2) * c1 > 100

can be written as

   SELECT c3 FROM t1 WHERE c3 * c1 > 100

Generated columns will be used to improve performance of queries. As a result, generated columns will likely be added after the table has been created and populated. See Creating and Populating a Table for more information.

Creating a User-defined Temporary Table

You use the DECLARE GLOBAL TEMPORARY TABLE statement to define a temporary table. The statement is used from within an application. The user-defined temporary table only persists until the application disconnects from the database.

The description of this table does not appear in the system catalog making it not persistent for, and not able to be shared with, other applications.

When the application using this table terminates or disconnects from the database, any data in the table is deleted and the table is implicitly dropped.

An example of how you can define a temporary table as follows:

   DECLARE GLOBAL TEMPORARY TABLE gbl_temp
      LIKE empltabl
      ON COMMIT DELETE ROWS
      NOT LOGGED
      IN usr_tbsp

This statement creates a user temporary table called gbl_temp. The user temporary table is defined with columns that have exactly the same name and description as the columns of the empltabl. The implicit definition only includes the column name, datatype, nullability characteristic, and column default value attributes. All other column attributes including unique constraints, foreign key constraints, triggers, and indexes are not defined. When a COMMIT operation is performed, all data in the table is deleted if no WITH HOLD cursor is open on the table. Changes made to the user temporary table are not logged. The user temporary table is placed in the specified user temporary table space. This table space must exist or the declaration of this table will fail.

Refer to the SQL Reference for additional information on the DECLARE GLOBAL TEMPORARY TABLE statement.
Note:A user-defined temporary table does not support:
  • LOB-type columns (or a distinct-type column based on a LOB)
  • User-defined type columns
  • LONG VARCHAR columns
  • DATALINK columns

Defining an Identity Column on a New Table

An identity column provides a way for DB2 to automatically generate a guaranteed-unique numeric value for each row that is added to the table. When creating a table where you know that you need to uniquely identify each row that will be added to the table, you can add an identity column to the table.

Once created, you cannot alter the table description to include an identity column.

It is the AS IDENTITY clause on the CREATE TABLE statement that allows for the specification of the identity column.

The following is an example of defining an identity column on the CREATE TABLE statement:

   CREATE TABLE table (col1 INT,
                       col2 DOUBLE,
                       col3 INT NOT NULL GENERATED ALWAYS AS IDENTITY
                                         (START WITH 100, INCREMENT BY 5))

In this example the third column is the identity column. You can also specify the value used in the column to uniquely identify each row when added. Here the first row entered has the value of "100" placed in the column; every subsequent row added to the table has the associated value increased by five.

Some additional example uses of an identity column are an order number, an employee number, a stock number, or an incident number. The values for an identity column can be generated by DB2: ALWAYS or BY DEFAULT.

An identity column defined as GENERATED ALWAYS is guaranteed to be unique. The values used are always generated by DB2. Applications are not allowed to provide an explicit value. An identity column defined as GENERATED BY DEFAULT gives applications a way to explicitly provide a value for the identity column. If the application does not provide a value, then DB2 will generate one. Since the application controls the value, DB2 cannot guarantee the uniqueness of the value. The GENERATED BY DEFAULT clause is meant for use for data propagation where the intent is to copy the contents of an existing table; or, for the unload and reloading of a table.
Note:Identity columns are not currently supported in a partitioned database environment.

Refer to SQL Reference for additional information on defining an identity column on a new table.

Creating a Typed Table

You can create a typed table using a variant of the CREATE TABLE statement. Refer to the Application Development Guide for all the information you need on typed tables.

Populating a Typed Table

You can populate a typed table after creating the structured types and then creating the corresponding tables and subtables. Refer to the Application Development Guide for all the information you need on typed tables.

Hierarchy Table

A hierarchy table is a table that is associated with the implementation of a typed table hierarchy. It is created at the same time as the root table of the hierarchy. Refer to the Application Development Guide for all the information you need on hierarchy tables.

Creating a Table in Multiple Table Spaces

Table data can be stored in the same table space as the index for the table, and any long column data associated with the table. You can also place the index in a separate table space, and place any long column data in a separate table space, apart from the table space for the rest of the table data. All table spaces must exist before the CREATE TABLE statement is run. The separation of the parts of the table can only be done using DMS table spaces.

To create a table in multiple table spaces using the Control Center:
  1. Expand the object tree until you see the Tables folder.
  2. Right-click the Tables folder, and select Create --> Tables Using Wizard from the pop-up menu.
  3. Type the table name and click Next.
  4. Select columns for your table.
  5. On the Table space page, click Use separate index space and Use separate long space, specify the information, and click Finish.

To create a table in multiple table spaces using the command line, enter:

   CREATE TABLE <name>
      (<column_name>  <data_type>  <null_attribute>)
       IN <table_space_name>
       INDEX IN <index_space_name>
       LONG  IN <long_space_name>

The following example shows how the EMP_PHOTO table could be created to store the different parts of the table in different table spaces:

   CREATE TABLE EMP_PHOTO
      (EMPNO        CHAR(6)      NOT NULL,
       PHOTO_FORMAT VARCHAR(10)  NOT NULL,
       PICTURE      BLOB(100K) )
   IN RESOURCE
   INDEX IN RESOURCE_INDEXES
   LONG  IN RESOURCE_PHOTO

This example will cause the EMP_PHOTO data to be stored as follows:

See Table Space Design Considerations for additional considerations on the use of multiple DMS table spaces for a single table.

Refer to the SQL Reference for more information.

Creating a Table in a Partitioned Database

Before creating a table that will be physically divided or partitioned, you need to consider the following:

One additional option exists when creating a table in a partitioned database environment: the partitioning key. A partitioning key is a key that is part of the definition of a table. It determines the partition on which each row of data is stored.

It is important to select an appropriate partitioning key because it cannot be changed later. Furthermore, any unique indexes (and therefore unique or primary keys) must be defined as a superset of the partitioning key. That is, if a partitioning key is defined, unique keys and primary keys must include all of the same columns as the partitioning key (they may have more columns).

If you do not specify the partitioning key explicitly, the following defaults are used. Ensure that the default partitioning key is appropriate.

Following is an example:

   CREATE TABLE MIXREC (MIX_CNTL INTEGER NOT NULL,
                        MIX_DESC CHAR(20) NOT NULL,
                        MIX_CHR  CHAR(9) NOT NULL,
                        MIX_INT INTEGER NOT NULL,
                        MIX_INTS SMALLINT NOT NULL,
                        MIX_DEC DECIMAL NOT NULL,
                        MIX_FLT FLOAT NOT NULL,
                        MIX_DATE DATE NOT NULL,
                        MIX_TIME TIME NOT NULL,
                        MIX_TMSTMP TIMESTAMP NOT NULL)
                        IN MIXTS12
                        PARTITIONING KEY (MIX_INT) USING HASHING

In the preceding example, the table space is MIXTS12 and the partitioning key is MIX_INT. If the partitioning key is not specified explicitly, it is MIX_CNTL. (If no primary key is specified and no partitioning key is defined, the partitioning key is the first non-long column in the list.)

A row of a table, and all information about that row, always resides on the same database partition.

The size limit for one partition of a table is 64 GB, or the available disk space, whichever is smaller. (This assumes a 4 KB page size for the table space.) The size of the table can be as large as 64 GB (or the available disk space) times the number of database partitions. If the page size for the table space is 8 KB, the size of the table can be as large as 128 GB (or the available disk space) times the number of database partitions. If the page size for the table space is 16 KB, the size of the table can be as large as 256 GB (or the available disk space) times the number of database partitions. If the page size for the table space is 32 KB, the size of the table can be as large as 512 GB (or the available disk space) times the number of database partitions.


[ Top of Page | Previous Page | Next Page ]