You can use user-defined programs to use the best data warehouse software for your needs, while providing a single point-of-control to administer the warehouse. The Data Warehouse Center will start an application that you define as a user-defined program at a scheduled time.
For example, if you have a data cleansing program that you want to use on your warehouse tables, you can define the data cleansing program as a user-defined program, and run a step for that program that starts after a step that populates the warehouse tables.
This chapter explains the following tasks:
A user-defined program is a program that you create or a program that is created by a vendor. You define a user-defined program to the Data Warehouse Center so that one or more steps can use the program for processing. If you are defining a stored procedure, you must ensure that the stored procedure exists in the target database. If you are defining any other type of program, you must install the program on one or more Data Warehouse Center agent sites before you define the program to the Data Warehouse Center.
After you define a user-defined program to the Data Warehouse Center, the program definition is available for use as a step in the Process Model window.
A user program group is a logical group that contains related user-defined programs. You must create a user program group before you can define a user-defined program to the Data Warehouse Center.
To define a user program group:
To define a user-defined program to the Data Warehouse Center:
schema.procedurename
On the Agent Sites page of the Program notebook, you must select the agent site on which the program is installed.
If you specified a user ID and password when you defined the agent site, the program will run as a user process. If you did not specify a user ID and password, the program will run however the warehouse agent was defined.
You can run some programs as user processes and other programs as system processes on the same workstation. To do this, define two agent sites on the workstation: one that has a user ID and password, and one that does not.
If you are defining a transformer, you must ensure that the warehouse agent that you associate with this program is associated with the database where your transformer exists. If you assign a warehouse agent to this program definition that is not associated with the target database specified on this page, you will receive a warning when you create a step that uses this program.
For example, you have the following warehouse agent, source, and
targets:
Agent | Source | Target |
Agent A | Source 1 | Target 1 |
|
| Target 2 |
Suppose Agent A is associated with Source 1 and Target 1. Now, suppose that your transformer exists on Target 2. If you select Agent A in your transformer program definition, you will receive a warning when you try to create a step. This warning is issued because Agent A is not associated with the target that contains the transformer.
To use Agent A, you must associate it with Target 2, or you must store the transformer on Target 1 and add Target 1 to the target resources definition:
If you are defining a transformer, from the Available Targets list, select the target on which the stored procedure exists and click Add. The name of the target moves to the Selected Targets list.
For more information about defining agent sites, see Defining agent sites and the online help.
You can use predefined tokens for some parameters. The Data Warehouse Center substitutes values for the tokens at run time. For example, there is a token for the database name of the target resource for a step, &TDB. If you include this token in your parameter list, the Data Warehouse Center provides the name of the database defined in the notebook of the warehouse target that contains the target table that is linked to the step. Tokens allow you to change the values that are passed depending on which step uses the program.
If you want your user-defined program to extract data from the sources that are linked to the step that uses the program, you can include the &STMT token in your parameter list. This token directs the Data Warehouse Center to pass the SELECT statement that is defined for the step to the program. The SELECT statement is passed to the program as a one command-line parameter, enclosed in double quotation marks. Any embedded quotation marks are enclosed in escape characters. If the SELECT statement is exceptionally long, it might conflict with operating system guidelines for the maximum number of parameters.
For a complete list of parameters with predefined tokens, see "Select Parameters fields and controls" in the online help.
To add a system parameter or system token to a program definition:
The parameters or tokens that you select are listed on the Parameters page of the Define User Program notebook.
Example: You are defining a user-defined program that checks for a file at regular intervals on a Windows NT workstation. You intend to use this program to find a file that another step will load into a database.
The program uses the following parameters:
You use the Warehouse target file name system parameter (&TTBN) to represent the file name. You define your own parameters for the polling interval and timeout interval.
After you define your user-defined program, you define steps that use the program for processing.
When you define a step that runs a user-defined program, you can change the parameter values that are defined for the program. If you change the parameter values for the program, the changes affect only the instance of the program that is being used in the step. The changes do not affect the original program definition.
To permanently change the parameter values of a user-defined program, you must edit the program definition.
If your user-defined program uses tokens for a source or target, you must link this step to the source or target.
To define a step that runs a user-defined program:
To edit an existing step within the Process Model window, click a process, and click Open. The Process Model window opens. Right-click the step, and click Properties.
To edit an existing step outside of the Process Model window, click a process in the Warehouse tree view. A list of objects that are contained in the process is displayed in the right frame. Right-click a step, and click Properties.
If the Populate externally check box is not selected, then the step must have a schedule, be linked to a transient table that is input to another step, or be started by another program in order for you to change the mode to production.
Example: You define a step that uses the user-defined program that you defined in the previous section. The step has no source. Because you are using the file to be found as a source for the next step in the sequence, you define the file as a target for this step. You then define a load step that uses the file as a source. The load step loads the file into a database.
After you define a step that uses a user-defined program, promote the step to test mode and run it.
To promote a step to test mode:
After the step is promoted, Test is selected when you right-click Mode.
To test a step:
The step is displayed on the Work in Progress window with a status of Populating. The processing should complete successfully. However, if the processing fails, see Testing a step for information about debugging what went wrong.
After you test the step, you can schedule it in the same way as any other step. For more information about scheduling steps, see Scheduling warehouse processes
You can write programs in any language that supports one of the following program types: executable, batch program, dynamic link library, or stored procedure.
If the user-defined program has a program type of executable, batch program, batch command file, or dynamic link library, it must reside on the agent site. The warehouse agent starts the program at the scheduled time. On Windows NT, the agent runs as a system process by default, which means that the program cannot access resources or programs that require a user ID. Also, any environment variables that the program needs to access must be system variables.
If you write user-defined programs using Object REXX for Windows, complete the following procedure to enable these programs to run under Windows NT:
DB2 CONNECT TO testdb USER db2admin USING db2pass
If the user-defined program has a program type of stored procedure, the program must reside in the warehouse database that is associated with the step that runs the program.
At run time, the Data Warehouse Center generates a command-line parameter list that it passes as input to your user-defined program. Whenever possible, test your program from the command line before using it in a step.
Example: You write a user-defined program that checks for a file at regular intervals on a Windows NT workstation. It uses the following parameters:
The program gets the parameters as shown in Figure 15:
Figure 15. Reading parameters from the command line
char *remoteFile = argv[1]; int interval = 5; // check every 5 minutes int timeout = 60; // time-out period in minutes if( (argc==3) || (argc==4) ) { if( NumStrValidation( argv[2] ) ) // get polling interval { interval = atoi( argv[2] ); if( interval == 0 ) interval = 5; // used the original one } else { trace << "Invalid value for polling interval\n"; flush(trace); return PARAMETER_ERROR; } if( argc==4 ) // get timeout period { if( NumStrValidation( argv[3] ) ) { timeout= atoi( argv[3] ); if( timeout == 0 ) timeout = 480; // used the original one } else { trace << "Invalid value for timeout period\n"; flush(trace); return PARAMETER_ERROR; } } } |
After your program runs, it must return a return code to the step that uses the program. The return code must be a positive integer. If your program does not return a return code, the step using the program might fail. The Data Warehouse Center displays the return code in the Error RC2 field of the Log Details window when the value of Error RC1 is 8410. If the value of Error RC2 is 0, then the program ran successfully without errors.
Your program can return additional status information to the Data Warehouse Center:
The warehouse agent transfers the additional status information to the warehouse server.
To transfer the additional status information to the warehouse agent, your program must create a file, called a feedback file, that contains the additional status information. The path and file name for the feedback file is specified in the VWP_LOG environment variable, which the warehouse agent sets before it calls the warehouse program. Your program must read the value of this variable, using the appropriate system call, to create the feedback file. After the user-defined program finishes running, the warehouse agent checks whether the feedback file exists. If it exists, the agent processes the file. Otherwise, the warehouse agent will do nothing. If the user-defined program cannot create the file, it should continue to run.
Your user-defined program can write the additional status information
to the feedback file in any order, but must use the following format to
identify information. Enclose each item returned within the begin tag
<tag> and end tag </tag> in the following
list. Each begin tag must be followed by its end tag; you cannot
include two begin tags in a row. For example, the following tag format
is valid:
<RC>...</RC>...<MSG>...</MSG>
The following embedded tag format is not valid:
<RC>...<MSG>...</RC>...</MSG>
You can specify the following information in the feedback file:
Figure 16 shows an example of the feedback file.
Figure 16. Example of the feedback file
<RC> 20</RC> <ROWS>2345</ROWS> <MSG>The parameter type is not correct</MSG> <COMMENT> Please supply the correct parameter type (PASSWORD NOTREQUIRED, GETPASSWORD, ENTERPASSWORD)</COMMENT> <BYTES> 123456</BYTES> <WARNING> 1</WARNING> <SQLSTATE>12345</SQLSTATE> |
The return codes and step status for the user-defined program that are displayed in the log viewer vary, depending on the following values set by the program:
Table 17 lists the possible combinations of these values and the
results that they produce.
Table 17. Feedback file conditions and results
Conditions | Results | ||||
---|---|---|---|---|---|
Step status1 | Values of Error RC1 and RC2 | ||||
The user-defined program return code is 0 | No feedback file exists2 | Successful |
RC1 = 0; RC2 = 0
| ||
A feedback file exists2 | The value of <RC> in the feedback file is 03 | <WARNING> is not set in the feedback file | Successful |
RC1 = 0; RC2 = 0
| |
The value of <WARNING> in the feedback file is 1 | Warning |
RC1 = 0; RC2 = 0
| |||
The value of <RC> in the feedback file is non-03 | <WARNING> is not set in the feedback file | Failed |
RC1 = 8410 (the user-defined program failed); RC2 = the value of <RC> in the feedback file
| ||
The value of <WARNING> in the feedback file is 1 | Warning |
RC1 = 0; RC2 = the value of <RC> in the feedback file
| |||
The user-defined program return code is nonzero | No feedback file exists2 | Failed |
RC1 = 8410 (the user-defined program failed); RC2 = the code returned by the user-defined program
| ||
A feedback file exists2 | The value of <RC> in the feedback file is 03 | <WARNING> is not set in the feedback file | Successful |
RC1 = 0; RC2 = 0
| |
The value of <WARNING> in the feedback file is 1 | Warning |
RC1 = 0; RC2 = 0
| |||
The value of <RC> in the feedback file is non-0 | <WARNING> is not set in the feedback file | Failed |
RC1 = 8410 (the user-defined program failed); RC2 = the code returned by the user-defined program
| ||
The value of <WARNING> in the feedback file is 1 | Warning |
RC1 = 0; RC2 = the value of <RC> in the feedback file
|
Notes: