WebSphere logo Classic Federation Server for z/OS, Version 9.1
WebSphere logo Classic Replication Server for z/OS, Version 9.1
WebSphere logo Classic Data Event Publisher for z/OS, Version 9.1
WebSphere logo Data Integration Classic Connector for z/OS, Version 9.1


Performance considerations with multiple record arrays

When you map tables containing record array definitions, result sets from queries can return so many rows that performance suffers.

When a table has multiple record array definitions that you map as arrays, queries that reference the table yield large result sets. To avoid performance problems, do one of the following:

If you map a separate table for each array, federated queries can read the data. If you flatten the structure, client applications can insert, update, and delete rows.

When you map a record array definition as an array, Classic Data Architect converts the data definition to SQL columns. Each instance of the array is combined with the non-array data items to create SQL rows. For example, if the record in the database for the employee ID 123456789 contains three dependents, three distinct rows are returned for that record. The following query returns three rows in the result set:

Query:

SELECT EMP_ID, NUMBER_OF_DEPENDENTS, DEP_ID, DEP_NAME FROM
CAC.EMPL WHERE EMP_ID = '123456789';
Result set:
EMP_ID    NUMBER_OF_DEPENDENTS  DEP_ID    DEP_NAME
123456789        3              111223333 Depen1
123456789        3              222334444 Depen2
123456789        3              333445555 Depen3

Calculating the size of result sets

You can use a formula to calculate the number of rows in the result set from a query on a table that contains arrays, before you supply any filtering predicates on the WHERE clause. The number of rows in the result set is the Cartesian product of these items:

<Number of instances in each record array> * <Number of physical records>

With variable-length arrays, you can't calculate the number of rows in the result set unless you know how many instances exist for each array.

Examples

The following example demonstrates how multiple arrays can lead to performance problems arising from large result sets.

  1. An employee has these data items:
    • Four dependents
    • Two emergency contacts
    • Three assignments
  2. The database has 200 employee records

A query generates 4 * 2 * 3 = 24 rows for this employee. If all the employee records have an average Cartesian product of 24 based on the data items stored in arrays, then a query of all the employee records yields 24 * 200 = 4800 rows in the result set.

In the next example, a COBOL copybook has two OCCURS clauses that define arrays that contain data about dependents and professional organizations:
  01  EMPLOYEE-RECORD.
      ...
      05  DEPENDENTS-ARRAY OCCURS 20 TIMES.
      ...
      05  ORGS-ARRAY OCCURS 5 TIMES PIC X(10).

If you use the New Table wizard in Classic Data Architect to create a table definition for this record layout, and you map both arrays to the same table, the result set for a single employee record is the Cartesian product of DEPENDENTS-ARRAY and ORGS-ARRAY.

The query processor does not return null array instances. In this example, an employee has four dependents and two professional organizations. A query generates 4 * 2 = 8 rows for that employee.

Related concepts
Record arrays
Record array definitions for federation and change capture
Related tasks
Creating a separate table for each record array in a table definition
Related reference
Array definition examples
Related information
Mapping data for Classic federation
Mapping data for change capture


Feedback

Update icon Last updated: 2007-10-09