In this lesson, you will define how the Data Warehouse Center is to move and transform data into its format for the data warehouse. First, you will define a process, which contains a series of steps in the transformation and movement process. Next, you will specify the source tables that are to be transformed for the warehouse. Finally, you will define data transformation steps that use two different methods of transformation:
Specifically, you will define the Tutorial Market process, which performs the following processing:
The end result is the LOOKUP_MARKET target table.
In this exercise, you will define the process object for the Tutorial Market process.
To define the process object:
The Define Process notebook opens.
Tutorial Market
The name can be up to 80 characters long and is case-sensitive. The first character of the name must be alphanumeric. You cannot use & as the first character in English. This field is required.
Process to create the LOOKUP_MARKET table
Adding the process to the security group authorizes the users in the group (in this case, you) to open and add objects to the process.
The Tutorial Warehouse Group moves to the Selected Security Groups list.
The Define Process notebook closes.
In this exercise, you will open the process so that you can graphically define the flow of data in the process in the next exercise.
To open the Tutorial Market process:
To define the flow of data, you need to add each source that the steps transform and the target tables that result from the transformation.
In the Tutorial Market process, you are to load the Demographics file into the target database, so you need to add the source file and the DEMOGRAPHICS_TARGET target table for the step to the process. The Demographics source file is part of the Tutorial File Source warehouse source, which you defined in Lesson 6, Defining warehouse sources. The DEMOGRAPHICS_TARGET target table is part of the Tutorial Targets warehouse target, which you defined in Lesson 7, Defining warehouse targets.
To add the Demographics file:
Click the canvas at the spot that you want to place the table. The Add Data window opens.
A list of the warehouse sources defined in the warehouse is displayed.
In the tree, you should see X:\sqllib\samples\db2sampl\dwc\demographics.txt, where X is the drive on which you installed the sample.
To add the DEMOGRAPHICS_TARGET target table:
A list of the warehouse targets defined in the warehouse is displayed.
You should see the DEMOGRAPHICS_TARGET target table in the list.
In the next part of this exercise, you need to add the GEOGRAPHIES source table. When you define a step that selects data from the GEOGRAPHIES table, you can specify that the Data Warehouse Center automatically generate a target table, so you do not need to add a target table.
To add the GEOGRAPHIES source table:
The file and tables that you selected are displayed on the Process Model window.
The last step will use the Demographics table and the Geographies table as sources, so you do not need to specify sources for the step. You can specify that the Data Warehouse Center automatically generate the resulting target table, LOOKUP_MARKET, so you do not need to specify a target table for the step.
Now, you need to add the steps that define how the source data is to be transformed into the target data. You must define three steps:
In this exercise, you will define the Load Demographics step:
Each program icon represents a program group, which is a grouping of similar programs.
The programs in a program group cascade from the program icon.
An icon for the step is added to the window.
The Step notebook opens.
Load Demographics Data
Loads demographics data into the warehouse.
The Step Properties notebook closes.
You will use the Data Link icon to define the flow of data from the source file, through transformation by a step, to the target table.
The Data Warehouse Center draws a line between the file and the step.
This indicates that the Demographics source file contains the source data for the step.
This indicates that the Demographics Table target table contains the target data for the step.
The Step notebook opens.
Accept the rest of the default values on the page.
You can skip the Column Mapping page because the Load program does not use the column mapping to generate the target table. It generates the target table based on the source file.
For more information about the values on this page, see "Loading data into a table" the online help.
The Step notebook closes.
In this exercise, you will define the Select Geographies step:
An icon for the step is added to the window.
The Step notebook opens.
Select Geographies Data
Selects Geographies data from the warehouse source
The Step notebook closes.
The Data Warehouse Center draws a line that indicates that the Geographies source table contains the source data for the step.
Because you will specify that the Data Warehouse Center is to create the target table, you do not need to link a target table to the step.
The Step notebook opens.
SQL Assist opens.
SQL Assist closes. The SQL statement that you just built is
displayed on the SQL Statement page.
The Data Warehouse Center returns sample results of your SELECT statement. These results should be the same as the results that you got in Lesson 3, Browsing the source data when you browsed the sample data for the Geographies source table.
Selecting this check box specifies that the Data Warehouse Center is to create the target table, based on the values that are specified on the Column Mapping page.
The warehouse target is the database or file system in which the target table is to be created.
Accept the rest of the default values on this page. For more information about the values on this page, see the online help.
The Step notebook closes. The Data Warehouse Center generates a target table called "Select_Geog_T". The target table name is in quotation marks because it is in mixed case.
The Table notebook opens.
GEOGRAPHIES_TARGET
Autogenerated target table for the Select Geographies Data step.
Geographies Target
To define the Join Market Data step:
An icon for the step is added to the window.
The Step notebook opens.
Join Market Data
Joins the Geographies table with the Demographics table
The Step notebook closes.
The Data Warehouse Center draws lines that indicate that the Geographies Target table and Demographics Target tables contain the source data for the step.
Because you will specify that the Data Warehouse Center is to create the target table, you do not need to link a target table to the step.
The Step notebook opens.
SQL Assist opens.
The DEMOGRAPHICS_TARGET.STATE column moves to the Available columns list.
The DEMOGRAPHICS_TARGET.CITY column moves to the Available columns list.
You do not need the DEMOGRAPHICS_TARGET.STATE and DEMOGRAPHICS_TARGET.CITY columns because it is already defined in the GEOGRAPHIES_TARGET table. You will join the tables using the CITY column in both tables, which is unique.
The Expression Builder window opens.
case when POPULATION < 9000000 then 1 when POPULATION < 15000000 then 2 Else 3 end
A new column is added to the Selected Columns list.
SIZE_ID
SQL Assist draws a line between the CITY columns, which indicates that the tables are joined on that column.
SQL Assist closes.
Selecting this check box specifies that the Data Warehouse Center is to create the target table, based on the values that are specified on the SQL Statement and Column Mapping pages.
Accept the rest of the default values on this page. For more information about the values on this page, see the online help.
The Step notebook closes.
The Table notebook opens.
LOOKUP_MARKET
Market dimension data
This table is one of the dimension tables that you will include in a star schema in Lesson 16, Creating a star schema from within the Data Warehouse Center.
You defined the LOOKUP_MARKET dimension table in the Tutorial Market process, which contains three steps:
The sources and targets for each of the steps are:
Step | Sources | Target |
---|---|---|
Load Demographics Data | Demographics file | Demographics Target table |
Select Geographies Data | Geographies table | Geographies Target table |
Join Market Data | Demographics Target table and Geographies Target table | Lookup Market table |
For this tutorial, you added the data links for each step while you were defining the properties of each step. Another way you can accomplish this task is to add all the steps in the process at one time, link the steps to their sources and targets, and then define the properties of each step. The Data Warehouse Center assigns default names to the steps that you can change in the Step notebook.
The table that you created in the previous section, LOOKUP_MARKET, is one of the dimension tables in the example in Tutorial business problem. This section covers the steps for building the rest of the dimension tables and the fact table for the star schema.
This section is optional, but if you do not complete the steps in this section, you will not be able to do the following lessons:
If you want to skip this section, go to Lesson 9, Testing warehouse steps.
When you define each table, you must define a new process for the table. Instead of defining your own step for the process, however, you will copy the step that is defined in the sample. The definition of the step is in the Data Warehouse Center that you are using. When you copy the step, the Data Warehouse Center copies the sources that the step uses and generates a target table.
To define the Product dimension table:
The Process Model window opens.
The Copy Step window opens.
Tutorial Select Product
Tutorial Product
The step and its sources are copied to the Tutorial Product process. The Data Warehouse Center generates the corresponding target table.
Repeat this procedure for the rest of the dimension tables and the fact
table.
Dimension | Tutorial Process | Sample Process | Sample Step | Tutorial Step | Warehouse Target | Source tables | Target Table | New Target Table Name |
---|---|---|---|---|---|---|---|---|
Time | Tutorial Time | Sample Time | Select Time | Tutorial Select Time | Tutorial Targets | TIME |
TARGET_ TIME |
LOOKUP_ TIME |
Scenario | Tutorial Scenario | Sample Scenario | Select Scenario | Tutorial Select Scenario | Tutorial Targets | SCENARIO |
TARGET _SCENARIO |
LOOKUP_ SCENARIO |
Fact Table | Tutorial Fact Table | Sample Fact Table | Fact Table Join | Tutorial Fact Table Join | Tutorial Targets |
SALES, INVENTORY, and PRODUCT _COSTS |
TARGET_ FACT_ TABLE |
FACT_ TABLE |
You now have five processes, and their associated sources, targets, and steps, defined in the Data Warehouse Center: