Data Movement Utilities Guide and Reference


Exporting Data in Parallel

Exporting data in parallel reduces data transfer, and distributes the writing of the result set, as well as the generation of the formatted output, across nodes in a more effective manner than would otherwise be the case. When data is exported in parallel (by invoking multiple export operations, one for each partition of a table), it is extracted, converted on the local nodes, and then written to the local file system. In contrast, when exporting data serially (exporting through a single export operation), it is extracted in parallel and then shipped to the client, where a single process performs conversion and writes the result set to a local file system.

The db2batch command is used to monitor the performance characteristics and execution duration of SQL statements. This utility also has a parallel export function in partitioned database environments that:

There are two types of parallel export, depending on what tables are queried, and where those tables exist in the partitioned database environment:

The export utility runs a query in parallel on each partition to retrieve the data on that partition. In the case of db2batch -p s, the original SELECT query is run in parallel. In the case of db2batch -p t and db2batch -p d, a staging table is loaded with the export data, using the specified query, and a SELECT * query is run on the staging table in parallel on each partition to export the data. To export only the data that resides on a given partition, db2batch adds the predicate NODENUMBER(colname) = CURRENT NODE to the WHERE clause of the query that is run on that partition. The colname parameter must be set to the qualified or the unqualified name of a table column. The export utility uses the first column name in the original query to set this parameter.

Following are the limitations on queries used by the export utility:

If, when specifying -p s on the db2batch command, you also use the -r option to create result output files, the files on each partition will be in sorted order if you have an ORDER BY clause. If a single sorted file is your objective, merge the sorted file on each partition into one sorted file. For example, on UNIX based systems, use the command sort -m to merge the files into a single sorted file. If you are sending your output to an NFS mounted file system, the output will not be sorted, even if you specify the ORDER BY clause.


[ Top of Page | Previous Page | Next Page ]