Replication Guide and Reference


Data replication tasks

This chapter introduces the key replication tasks that you perform at various stages in the replication process. The tasks are grouped into these major stages:

  1. Planning your replication requirements
  2. Setting up your replication environment
  3. Operating in your replication environment

After you read this chapter, go to Administration for detailed information about these tasks. Also, see Operations for specific information about using the Capture and Apply programs on particular operating systems.


Planning your replication requirements

An important step in designing an appropriate replication environment is to determine the characteristics of your application data, who needs to access the data, and how frequently they need to access it.

You can use DB2 data replication to maintain data in more than one location and keep the various copies of it synchronized. You must determine where your source data will be coming from. You must decide whether you want all or some of the source information copied, or whether you want only changes copied, and how many copies (or targets) you need. You also need to determine where the copies will be located.

Although you cannot update the source tables and target tables synchronously, you can schedule the updates to meet the needs of your applications and your replication environment. The frequency of replication depends on how much lag time is acceptable between the time that the source is updated and the time that the targets are updated. Therefore, you must decide how synchronized the copies must be with the source and with each other before you can come up with a replication model.

After you understand your application data requirements, you can design the replication model that will help you meet those requirements. There are many factors that you need to consider when you design your model. These are some of the more important decisions that you need to make:

The replication configuration
Based on your data needs, you must decide whether you need a consolidation, distribution, update-anywhere, or occasionally connected configuration. You have the flexibility to design your environment so that it uses one of these configurations or some combination of them.

Where to locate the control server
You will get slightly better performance if you place the control tables on the same server as the Apply program instead of placing them centrally, because the Apply program frequently reads the control tables at the control server. You can have your Apply programs share a single control server so that your control information is stored centrally. The control server can be located at the source server, the target server, or any database server that the Apply program can connect to. A central control server is popular because it simplifies the administration of large networks, but it has two drawbacks: the Apply program must access the control information over the network and, if the control server goes down, all of the Apply processes are affected. However, if the source server is in a secure environment, locating the control server at the source server can improve security and let you manage and monitor replication subscriptions centrally.

The type of target tables to use
The type of target table that you use depends on your replication requirements. Each type is best suited for specific situations. For example, a replica is the only type of target table that you can use for update-anywhere replication, and a row-replica is the only type of target table that you can use with DataPropagator for Microsoft Jet.

Whether to use existing target tables
You can let the administration interface create the target table for you or you can use an existing table as a target. If the existing tables are DB2 tables, the data types are supported by the DB2 data replication components. If your replication environment includes non-IBM databases, some of the data types might not map directly to the source tables that you are using.

Which columns to make available for replication
You can choose to capture only the after-image column values or both the before-image column values and the after-image column values. If you will be using the targets for auditing purposes, or if you have replica target tables, you must copy both the after-image and before-image column values.

How to capture SQL operations
You might want to capture all updates as two rows in the CD table or in the CCD table of a non-IBM source: a DELETE of the before-image column values followed by an INSERT of the after-image column values. This includes updates of columns that will be the primary key of the target, columns that will be the partitioning key of the target, or columns that are part of the WHERE clause or predicate of the subscription set. You might need to adjust the size of the CD table to accommodate this increased overhead.
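
The two-row capture described above can be sketched as follows. This is a minimal Python illustration, not DB2 code: the `ChangeRow` type, the `split_update` function, the operation codes "D" and "I", and the column names are all hypothetical and do not reflect the real layout of a CD or CCD table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ChangeRow:
    """One captured change (hypothetical shape, not a real CD table row)."""
    operation: str             # "D" for DELETE, "I" for INSERT (illustrative codes)
    values: Dict[str, object]  # column name -> column value


def split_update(before: Dict[str, object], after: Dict[str, object]) -> List[ChangeRow]:
    """Represent one UPDATE as a DELETE of the before-image values
    followed by an INSERT of the after-image values."""
    return [ChangeRow("D", dict(before)), ChangeRow("I", dict(after))]


# An update that changes a column used as the target's key
# (REGION in this made-up example).
rows = split_update(
    before={"CUST_ID": 7, "REGION": "EAST"},
    after={"CUST_ID": 7, "REGION": "WEST"},
)
for row in rows:
    print(row.operation, row.values)
```

Because each update is stored as two rows instead of one, a table that receives many updates holds more rows under this scheme, which is why the size of the CD table might need to be adjusted.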

The level of constraints
You must use referential constraints to enforce referential integrity only if you have target tables that are replica tables. If you have a read-only table, you do not need to set constraints at the target. The referential integrity of other types of target tables is ensured if you define your subscription sets appropriately.

Which joins to use
Joins are described in views, which in turn are defined in replication sources. For example, you might use a view to change the name of copied columns, to reference columns from related tables in the WHERE clause in your subscription member predicate, to incrementally maintain copies that are inner joins of two or more tables, or to replicate information from one table when an update is made to another table.

When you are ready to plan your replication environment, see Planning for replication for detailed planning information.


Setting up your replication environment

After you design the replication model, you must set up your replication environment. These steps are involved in setting up your replication environment:

  1. Setting up the system
  2. Defining the replication criteria
  3. Performing the initial replication

The rest of this section introduces these steps. For detailed instructions, see Setting up your replication environment.

Setting up the system

To set up the system, you perform the following steps:

  1. Migrate from previous releases of DataPropagator products.
  2. Grant access to the proper user IDs.

Defining the replication criteria

To define the replication criteria, you perform the following steps:

  1. Configure the administration tool. For example, if you are using DJRA, you need to associate passwords with databases.
  2. Customize and create replication control tables.
  3. Customize change data (CD) tables. This step is optional. You can change the default name and table space of your CD tables. If you are using the DB2 Control Center, you must customize your CD tables before you define a replication source. If you are using the DJRA tool, you customize the CD tables when you define the replication source.
  4. Define replication sources. This step includes identifying the table or view from which you want data copied and the types of changes that you want captured.
  5. Define subscription sets and subscription-set members. This step includes associating the replication source with the target to which you want the changes replicated. You can define subscription sets and subscription-set members at any time prior to starting the Apply program.
  6. Configure the Capture program. This step includes enabling the source server for logging; it also includes creating and binding the Capture program package to the source server.
  7. Configure the Apply program. This step includes creating and binding the Apply program package to the source server, the target server, and the control server; it also includes creating and binding the Apply program to the target server. (See footnote 11.)

Performing the initial replication

Important: When you set up your replication environment, you must start the Capture program and let it initialize fully before you start any Apply programs.

To perform the initial replication, you must perform the following steps in the order shown:

  1. Make sure that at least one replication source is defined.
  2. Start the Capture program. This step includes specifying invocation parameters (such as NOPRUNE, which prevents automatic pruning of the CD and UOW tables). After the Capture program is fully initialized, it will not capture any changes until the Apply program signals it to do so.
  3. If you haven't already done so, define at least one subscription set and one subscription-set member.
  4. Start one or more Apply programs. This step includes specifying invocation parameters (such as LOADX, which calls ASNLOAD, an exit routine that initializes target tables). Each Apply program will perform a full-refresh copy for all subscription-set members, and the Capture program will begin capturing changes for the associated replication sources. (See footnote 12.)

Tip: Use the WARMNS option in the Capture program if you want to be able to repair any problems (such as unavailable databases or table spaces) that might prevent a warm start from occurring.
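
The ordering rules for the initial replication can be sketched as a small prerequisite check. This is an illustrative Python sketch only: the step names and the `first_violation` function are hypothetical, and it assumes that a subscription set can be defined only after its replication source exists. Here `start_capture` stands for starting the Capture program and letting it initialize fully.

```python
from typing import Dict, List, Set

# Prerequisites for each step (hypothetical step names).
PREREQUISITES: Dict[str, Set[str]] = {
    "define_replication_source": set(),
    "start_capture": {"define_replication_source"},
    # Assumption: a subscription-set member refers to a defined source.
    "define_subscription_set": {"define_replication_source"},
    "start_apply": {"start_capture", "define_subscription_set"},
}


def first_violation(steps: List[str]) -> str:
    """Return a message for the first step that runs before its
    prerequisites, or an empty string if the sequence is valid."""
    done: Set[str] = set()
    for step in steps:
        missing = PREREQUISITES[step] - done
        if missing:
            return f"{step} requires {', '.join(sorted(missing))} first"
        done.add(step)
    return ""


good = ["define_replication_source", "start_capture",
        "define_subscription_set", "start_apply"]
bad = ["define_replication_source", "define_subscription_set", "start_apply"]

print(first_violation(good) or "order is valid")
print(first_violation(bad))
```

The second sequence is rejected because an Apply program is started before the Capture program, which is the situation that the Important note above warns against.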

Adding to your replication environment

You will probably need to add replication sources and subscription sets to your replication environment from time to time.

To add to your replication environment, you must perform the following steps in the order shown:

  1. Define the new replication source.
  2. Run the Capture reinit command, or stop the Capture program and warm start it.
  3. Define the new subscription sets and subscription-set members.
  4. Start a new Apply program if one is needed. If an Apply program is already running and it uses the Apply qualifier that is associated with the new subscription set, it recognizes the new subscription set automatically. Otherwise, you must start a new Apply program using the appropriate Apply qualifier before the new subscription set can be recognized.
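
The decision in the last step can be expressed as a simple membership test. This is a minimal Python sketch under stated assumptions: the function name and the Apply qualifier values are hypothetical, and the set of running qualifiers would have to come from your own records of which Apply programs are active.

```python
from typing import Set


def needs_new_apply_program(running_qualifiers: Set[str], new_set_qualifier: str) -> bool:
    """Return True when no running Apply program uses the Apply qualifier
    that is associated with the new subscription set."""
    return new_set_qualifier not in running_qualifiers


running = {"APPLYQ1", "APPLYQ2"}  # hypothetical Apply qualifiers in use

print(needs_new_apply_program(running, "APPLYQ1"))
print(needs_new_apply_program(running, "APPLYQ3"))
```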

Copying your replication environment

After you define your replication environment on one system (for example, a test system), you can copy the replication environment to another system (for example, a production system). You use the promote functions to reverse-engineer your tables, replication sources, and subscription sets and to create a script file with the appropriate data definition language (DDL) and data manipulation language (DML). For more information about the promote functions, see Copying your replication configuration to another system and the on-line help for the administration interface.


Operating in your replication environment

After your replication environment is up and running and updates are replicated, you need to perform periodic maintenance tasks. These include the following tasks:

Configuring the pruning of control tables
The UOW and CD tables will grow too large if their contents are not pruned regularly. You can configure your system to prune automatically, or you can prune manually. You control how frequently obsolete information is removed from these tables. If the tables aren't pruned often enough, the table space that they're in will run out of space, which will force the Capture program to stop. If they are pruned too often or during peak times, the pruning interferes with the change capture process. You need to determine the optimal pruning frequency for your replication environment.
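
The idea of removing only obsolete rows can be sketched as follows. This Python sketch is an illustration, not the pruning logic of the Capture program: it assumes that a change row is obsolete once every subscription set has applied it, and the `prune` function, the sequence numbers, and the set names are all hypothetical.

```python
from typing import Dict, List


def prune(change_rows: List[int], applied_up_to: Dict[str, int]) -> List[int]:
    """Keep only the change rows that at least one subscription set still needs.

    change_rows: sequence numbers of rows in a change table (hypothetical).
    applied_up_to: highest sequence number each subscription set has applied.
    """
    if not applied_up_to:
        # No subscription set has applied anything, so nothing is obsolete.
        return list(change_rows)
    # Rows at or below this point have been applied by every subscription set.
    safe_point = min(applied_up_to.values())
    return [seq for seq in change_rows if seq > safe_point]


remaining = prune([101, 102, 103, 104, 105], {"SET_A": 104, "SET_B": 102})
print(remaining)  # [103, 104, 105]
```

In this sketch the slowest subscription set determines how much can be removed, which suggests why a subscription set that falls behind can cause the tables to keep growing.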

Monitoring important criteria
Many factors determine how well your replication environment performs. You can use the Replication Monitor, which is part of DJRA, to generate a report that will help you monitor the activities of the Capture and Apply components, as well as the status of the subscription sets. For example, the report contains historical information to help you determine trends about subscription latencies.

Dealing with data modification conflicts
If you are using update-anywhere replication, and you did not design your configuration to prevent update conflicts, you must handle update conflicts and rejected transactions.

Performing regular database maintenance
If you want your replication environment to run smoothly, you must regularly perform database maintenance tasks. For example, use the RUNSTATS utility against the DB2 catalog tables to collect new statistics for tables and indexes. Also use the RUNSTATS utility once after the CD and UOW tables have sufficient data in them so that the DB2 Optimizer will use indexes on them. Periodically use the REORG utility (or the RGZPFM command on AS/400) for the change data tables, the unit-of-work table, and the target tables. You must also delete rows from the Apply trail table, which contains subscription set statistics and error information.

Coordinating with DB2 utility operations
If you want to run DB2 utilities (such as REORG, RUNSTATS, BIND PACKAGE, and REVOKE) that will use the table spaces that contain the replication control tables, you must stop the Capture and Apply programs before running the utilities.

Changing your replication configuration as your business needs change
You are likely to need to modify your replication environment from time to time. Whether you add a new column to an existing source table, or drop a source table, you will need to modify your replication criteria. Also, you will need to maintain password files. For more information about modifying your replication configuration, see Modifying your replication configuration.

Troubleshooting
If you find that your replication environment is not performing as you expected, or if you can't replicate data, you can run the Replication Analyzer. The Replication Analyzer is a tool that is packaged with DB2 Universal Database and the DataJoiner Replication Administration tool. You can use the Replication Analyzer to analyze the behavior of the Capture program or the Apply program. It can answer such questions as "Why is the Capture program not capturing?" and "Why is the Apply program not applying?" The Replication Analyzer can help diagnose problems, verify replication setup, and offer suggestions for performance tuning. You can also look in the Apply trail table for status information about the Apply program, or in the Capture trace table for status information about the Capture program. For details, see Problem determination.

For general information about operating in a replication environment, see Operating DB2 DataPropagator. For information about operating in a particular operating system, see the appropriate chapter in Operations.


Footnotes:

11
If the Capture program and the Apply program are not on OS/390, they will automatically bind.

12
If you use non-IBM load utilities, it is recommended that you use the offline load feature in DJRA. For more information on setting up the offline load feature with DJRA, see Loading target tables offline using DJRA.

