In this lesson, you will define how the Data Warehouse Center is to move and transform data into a format for the data warehouse. First, you will define a process, which contains a series of steps in the data transformation and movement process. Next, you will specify the source tables that are to be transformed for the warehouse. Finally, you will define data transformation steps that use two different methods of transformation:
Specifically, you will define the Build Tutorial Market Dimension process, which performs the following processing:
The end result is the LOOKUP_MARKET target table.
In this exercise, you will define the process object for the Build Tutorial Market Dimension process.
To define the process object:
The Define Process notebook opens.
Build Tutorial Market Dimension
The name can be up to 80 characters long and is case-sensitive. The first character of the name must be alphanumeric. You cannot use & as the first character in English. This field is required.
Process to create the LOOKUP_MARKET table
Adding the process to the warehouse group authorizes the users in the group (in this case, you) to open and add objects to the process.
The Define Process notebook closes.
In this exercise, you will open the process so that you can graphically define the flow of data.
To open the Build Tutorial Market Dimension process:
To define the flow of data, you need to add each source that the steps transform and the target tables that result from the transformation.
In the Build Tutorial Market Dimension process, you will load the Demographics file into the target database. You need to add to the process the source file and the DEMOGRAPHICS_TARGET table for the step. The Demographics source file is part of the Tutorial File Source warehouse source, which you defined in Defining warehouse sources. The DEMOGRAPHICS_TARGET table is part of the Tutorial Targets warehouse target, which you defined in Defining warehouse targets.
To add the Demographics file:
A list of the available warehouse sources is displayed.
In the tree, you should see the following file:
X:\program files\sqllib\samples\db2sampl\dwc\demographics.txt, where X is the drive on which you installed the sample.
To add the DEMOGRAPHICS_TARGET table:
A list of the available warehouse targets is displayed.
The DEMOGRAPHICS_TARGET table is displayed in the list.
In the next part of this exercise, you need to add the source table. When you define a step that selects data from the SAMPLTBC.GEOGRAPHIES table, you can specify that the Data Warehouse Center automatically generate a target table, so you do not need to add a target table.
To add the SAMPLTBC.GEOGRAPHIES source table:
The file and tables that you selected are displayed in the Process Model window.
Now, you need to add the steps that define how the source data is to be transformed into the target data. You must define three steps:
To define the Load Demographics Data step:
Each program option in the menu represents a program group, which is a grouping of similar programs.
An icon for the step is added to the window between the demographics.txt file and the DEMOGRAPHICS_TARGET table.
The Properties notebook for the step opens.
Load Demographics Data
Loads demographics data into the warehouse.
The Properties notebook for the step closes.
The Data Link icon defines the flow of data from the source file, through transformation by a step, to the target table.
The Data Warehouse Center draws a line between the file and the step. The line indicates that the demographics.txt file contains the source data for the step.
This line indicates that the DEMOGRAPHICS_TARGET table contains the target data for the step.
The Properties notebook for the step opens.
For this exercise, you will use the default values on the page.
You can skip the Column Mapping page because the Load program does not use the column mapping to generate the target table. It generates the target table based on the source file.
The Properties notebook for the step closes.
In this exercise, you will define the Select Geographies step:
The Properties notebook for the step opens.
Select Geographies Data
Selects Geographies data from the warehouse source
The Properties notebook for the step closes.
The Data Warehouse Center draws a line that indicates that the Geographies source table contains the source data for the step.
You do not need to link a target table to the step because you will use the Data Warehouse Center to create a target table.
To select columns from the Geographies source table:
The Properties notebook for the step opens.
The Build SQL wizard opens. The Tables page is displayed.
The Build SQL wizard closes. The SQL statement that you created is
displayed on the SQL Statement page.
The Data Warehouse Center returns sample results of your SELECT statement. Compare your results to the results that you got in the sample shown in the topic, Viewing table data.
Now that you have specified which columns the SQL step will access in the source table, you are ready to create the target table. Leave the Properties notebook for the step open and go to the exercise, Creating the GEOGRAPHIES_TARGET table.
In this exercise, you will create the GEOGRAPHIES_TARGET table using the Column Mapping page in the Select Geographies Data step.
To create the GEOGRAPHIES_TARGET table:
The Source columns are displayed on the left side of the page, and the target columns list is on the right side of the page. There are no target columns in the list because you did not specify the parameters for creating the target table.
The Generate Default Table window opens.
The warehouse target is the database or file system in which the target table is created.
Because you are creating the table in the default table space, you can skip the Table space list.
GEOGRAPHIES_TARGET
The columns of the GEOGRAPHIES_TARGET table are displayed in the target columns list on the right side of the Column Mapping page.
For this exercise, you will use the default values on this page.
The Properties notebook for the step closes. The Data Warehouse Center creates a target table called GEOGRAPHIES_TARGET and links it to the step.
In this exercise, you will specify the properties for the GEOGRAPHIES_TARGET table that you created in the exercise, Creating the GEOGRAPHIES_TARGET table.
To specify the properties of the GEOGRAPHIES_TARGET table:
Geographies Target
To define the Join Market Data step:
An icon for the step is added to the window.
The Properties notebook for the step opens.
Join Market Data
Joins the Geographies table with the Demographics table
The Properties notebook for the step closes.
The Data Warehouse Center draws a line that indicates that the GEOGRAPHIES_TARGET table contains source data for the step.
Because you will specify that the Data Warehouse Center is to create the target table, you do not need to link a target table to the step.
The Data Warehouse Center draws a line that indicates that the DEMOGRAPHICS_TARGET table contains source data for the step.
To join the DEMOGRAPHICS_TARGET and GEOGRAPHIES_TARGET tables:
The Properties notebook for the step opens.
The Build SQL window opens.
You do not need the DEMOGRAPHICS_TARGET.STATE and DEMOGRAPHICS_TARGET.CITY columns because they are already defined in the GEOGRAPHIES_TARGET table. You will join the tables using the CITY column because it is unique to both tables.
The Expression Builder window opens.
case when POPULATION < 9000000 then 1 when POPULATION < 15000000 then 2 Else 3 end
A new column is added to the Selected columns list.
SIZE_ID
The Build SQL wizard draws a line between the CITY columns, which indicates that the tables are joined on that column.
The Build SQL wizard closes.
To create the LOOKUP_MARKET table:
The Generate Default Table opens.
LOOKUP_MARKET
The Generate Default Table window closes. The target columns are displayed in the target columns list on the right side of the Column Mapping page.
For this exercise, you will use the default values for the rest of this page.
The Properties notebook for the step closes, and the LOOKUP_MARKET table is displayed in the Process Model window.
To define the properties of the LOOKUP_MARKET table:
The Properties notebook for the table opens.
Market dimension data
The LOOKUP_MARKET table is one of the dimension tables that you will include in a star schema in Creating a star schema from within the Data Warehouse Center.
"Whse Market PK"
Your process now looks like this:
You defined the LOOKUP_MARKET dimension table in the Build Tutorial Market Dimension process, which contains three steps:
The sources and targets for each of the steps are shown in the following table:
Step | Sources | Target |
---|---|---|
Load Demographics Data | Demographics file | DEMOGRAPHICS_TARGET table |
Select Geographies Data | Geographies table | GEOGRAPHIES_TARGET table |
Join Market Data | Demographics Target table and Geographies Target table | LOOKUP_MARKET table |
You added the data links for each step when you defined the properties of each step. Another way you can accomplish this task is to add all the steps in the process at one time, link the steps to their sources and targets, and then define the properties of each step. The Data Warehouse Center assigns default names to the steps, which you can change in the Properties notebook for the step.
The table that you created in the previous exercise, LOOKUP_MARKET, is one of the dimension tables described in the example in Tutorial business problem. This exercise covers the steps for building the rest of the dimension tables and the fact table in the star schema.
You need to complete the steps in this exercise before you can do the following lessons:
If you want to skip this exercise, go to Testing warehouse steps.
When you define each table, you must define a new process for the table. Instead of defining a step for the process, you will copy the step that is defined in the sample. When you copy the step, the Data Warehouse Center copies the sources that the step uses and generates a target table.
To define the Product dimension table:
Build Tutorial Product Dimension
The Process Model window opens.
The Copy Step window opens.
Tutorial Select Product
Build Tutorial Product Dimension
The step and its sources are copied to the Build Tutorial Product Dimension process. The Data Warehouse Center creates the corresponding target table.
LOOKUP_PRODUCT
To rename the target table:
The Properties notebook for the table opens.
LOOKUP_PRODUCT
Repeat this procedure for the Time and Scenario dimension tables,
substituting the values in the following table where appropriate.
Dimension | Tutorial process | Sample process | Sample step to copy | New tutorial step name | Source tables | Target table | New target table name |
---|---|---|---|---|---|---|---|
Time | Build Tutorial Time Dimension | Build Sample Time Dimension | Select Time | Tutorial Select Time | TIME |
"SelectTime _T" |
LOOKUP_ TIME |
Scenario | Build Tutorial Scenario Dimension | Build Sample Scenario Dimension | Select Scenario | Tutorial Select Scenario | SCENARIO |
"SelectScen _T" |
LOOKUP_ SCENARIO |
In the previous exercise, you defined the dimension tables in the star schema. In this exercise, you will define the fact table in the star schema.
To define the fact table:
Build Tutorial Fact Table
The Process Model window opens.
The Copy Step window opens.
Tutorial Fact Table Join
Build Tutorial Fact Table
The step and its sources are copied to the Build Tutorial Fact Table process. The Data Warehouse Center generates the corresponding target table.
FACT_TABLE
To rename the target table:
The Properties notebook for the table opens.
FACT_TABLE
You now have five processes, with their associated sources, targets, and steps, defined in the Data Warehouse Center: