Business Intelligence Tutorial

Adding steps to the process

Now, you need to add the steps that define how the source data is to be transformed into the target data. You must define three steps:

Load Demographics Data
A DB2 program step that loads data from the Demographics file into a table in the TBC Warehouse database, which you created in Lesson 2, Creating a warehouse database.

Select Geographies Data
An SQL step that selects columns from the SAMPLTBC.GEOGRAPHIES source table.

Join Market Data
An SQL step that joins the SAMPLTBC.GEOGRAPHIES and Demographics tables and writes the joined table into the Tutorial Warehouse database.

Defining the Load Demographics Data step

In this exercise, you will define the Load Demographics Data step:

  1. From the palette (the toolbar on the left side of the window), click the DB2 Programs icon:
    Figure db2prog not displayed.

    You may need to maximize the window in order to see the DB2 Programs icon.

    Each program option in the menu represents a program group, which is a grouping of similar programs.

  2. Click DB2 UDB --> Load.
  3. Click the spot on the canvas between the demographics.txt file and the DEMOGRAPHICS_TARGET table.

    An icon for the step is added to the window between the demographics.txt file and the DEMOGRAPHICS_TARGET table.

  4. Right-click the new step.
  5. Click Properties.

    The step Properties notebook opens.

  6. In the Name field, type the name of the step:
    Load Demographics Data
    
  7. In the Administrator field, type your name as the contact for this step.
  8. In the Description field, type the description of the step:
    Loads demographics data into the warehouse.
    
  9. Click OK.

    The step Properties notebook closes.

  10. Click the Task Flow icon:
    Figure arrowcol not displayed.
  11. Click the Data Link icon:
    Figure arrowgry not displayed.

    You will use the Data Link icon to define the flow of data from the source file, through transformation by a step, to the target table.

  12. Click the middle of the Demographics source file and drag the mouse to the Load Demographics Data step.

    The Data Warehouse Center draws a line between the file and the step.

    This indicates that the Demographics source file contains the source data for the step.

  13. Click the middle of the Load Demographics Data step and drag the mouse to the DEMOGRAPHICS_TARGET table.


    A link between the Demographics source file and the Load Demographics Step

    This indicates that the DEMOGRAPHICS_TARGET table contains the target data for the step.

  14. Right-click the Load Demographics Data step.
  15. Click Properties.

    The step Properties notebook opens.

  16. Click the Parameters tab.
    The Parameters page of the Step Properties notebook
  17. From the Load mode list, click REPLACE.

    Accept the rest of the default values on the page.

    You can skip the Column Mapping page because the Load program does not use the column mapping to generate the target table. It generates the target table based on the source file.

  18. Click OK.

    The step Properties notebook closes.

Defining the Select Geographies step

In this exercise, you will define the Select Geographies step:

  1. From the palette, click the SQL icon:
    Figure sql not displayed.
  2. Click the spot on the canvas below the GEOGRAPHIES table.

    An icon for the step is added to the window.

  3. Right-click the step.
  4. Click Properties.

    The step Properties notebook opens.

  5. In the Name field, type the name of the step:
    Select Geographies Data
    
  6. In the Administrator field, type your name as the name of the contact for the step.
  7. In the Description field, type the description of the step:
    Selects Geographies data from the warehouse source
    
  8. Click OK.

    The step Properties notebook closes.

  9. Click the Task Flow icon:
    Figure arrowcol not displayed.
  10. Click the Data Link icon:
    Figure arrowgry not displayed.
  11. Click the middle of the Geographies source table and drag the mouse to the middle of the Select Geographies Data step.

    The Data Warehouse Center draws a line that indicates that the Geographies source table contains the source data for the step.

    Because you will specify that the Data Warehouse Center is to create the target table, you do not need to link a target table to the step.

  12. Right-click the Select Geographies Data step.
  13. Click Properties.

    The step Properties notebook opens.

  14. Click the SQL Statement tab.
  15. Click Build SQL to build SQL using the Build SQL wizard, which generates SQL based on the selections that you make in the wizard.

    Build SQL opens. The Tables tab is displayed.

  16. Verify that SAMPLTBC.GEOGRAPHIES is listed.
    Figure db2tu016 not displayed.
  17. Click the Columns tab.
  18. Click >> to add all the columns from the SAMPLTBC.GEOGRAPHIES table to the Selected columns list.
    The Columns page of the Build SQL wizard
  19. Click the Review tab to view the SQL statement that you just built.
  20. Click OK.

    Build SQL closes. The SQL statement that you just built is displayed on the SQL Statement page.
    The GEOGRAPHIES SELECT statement on the SQL Statement page.

  21. Click Test to test the SQL that you just generated.

    The Data Warehouse Center returns sample results of your SELECT statement. These results should be the same as the results that you got in Lesson 3, Browsing the source data when you browsed the sample data for the SAMPLTBC.GEOGRAPHIES source table. You can compare your results to the graphic in Lesson 3, Browsing the source data.

  22. Click Close to close the window.
  23. Select the Create Warehouse target table based on parameters check box.

    Selecting this check box specifies that the Data Warehouse Center is to create the target table, based on the values that are specified on the Column Mapping page.

  24. From the Warehouse target list, click Tutorial Targets.

    The warehouse target is the database or file system in which the target table is to be created.

  25. Click the Column Mapping tab.
  26. Verify that the source columns map to the correct target columns.
    Figure db2tu019 not displayed.
  27. Click the Processing Options tab.
  28. Verify that Replace is selected in the Population type list.
  29. Verify that the Run on demand check box is selected.

    Accept the rest of the default values on this page. For more information about the values on this page, see the online help.

  30. Click OK.

    The Step notebook closes. The Data Warehouse Center generates a target table called "Select_Geog_T". The target table name is in quotation marks because it is in mixed case.

  31. Change the name of the target table:

    1. Right-click the "Select_Geog_T" target table.
    2. Click Properties.

      The table Properties notebook opens.

    3. Verify that IWH is selected in the Table schema field.
    4. In the Table name field, type the new name for the table:
      GEOGRAPHIES_TARGET
      
    5. In the Description field, type a description of the table:
      Autogenerated target table for the Select Geographies Data step.
      
    6. In the Business name field, type a descriptive name for the table:
      Geographies Target
      
    7. Clear the Part of an OLAP schema check box.
    8. Click the Columns tab to view the column definitions.
    9. Click OK. The table Properties notebook closes.
  32. In the Process Model window, click View --> Arrange

Defining the Join Market Data step

To define the Join Market Data step:

  1. From the palette, click the SQL icon.
  2. Click the spot on the canvas below the two columns of steps and tables.

    An icon for the step is added to the window.

  3. Right-click the new step.
  4. Click Properties.

    The step Properties notebook opens.

  5. In the Name field, type the name of the step:
    Join Market Data
    
  6. In the Administrator field, type your name as the contact for the step.
  7. In the Description field, type the description of the step:
    Joins the Geographies table with the Demographics table
    
  8. Click OK.

    The Step notebook closes.

  9. Click the Task Flow icon:
    Figure arrowcol not displayed.
  10. Click the Data Link icon:
    Figure arrowgry not displayed.
  11. Click the middle of the GEOGRAPHIES_TARGET table and drag the mouse to the Join Market Data step.

    The Data Warehouse Center draws a line that indicates that the GEOGRAPHIES_TARGET table contains source data for the step.

    Because you will specify that the Data Warehouse Center is to create the target table, you do not need to link a target table to the step.

  12. Repeat this operation with the Demographics Target table and the Join Market Data Step.

    The Data Warehouse Center draws a line that indicates that the DEMOGRAPHICS_TARGET table contains source data for the step.

  13. Right-click the Join Market Data step.
  14. Click Properties.

    The step Properties notebook opens.

  15. Click the SQL Statement tab.
  16. Click Build SQL to have the Data Warehouse Center generate SQL. (Otherwise, you can type in your own SQL.)

    Build SQL opens.

  17. On the Tables tab, verify that the DEMOGRAPHICS_TARGET and GEOGRAPHIES_TARGET tables are listed.
  18. Click the Columns tab.
    The GEOGRAPHIES SELECT statement on the SQL Statement page.
  19. Click >> to add all the columns from the Geographies table and the Demographics table to the Selected columns list.
  20. From the Selected columns list, click DEMOGRAPHICS_TARGET.STATE.
  21. Click <.

    The DEMOGRAPHICS_TARGET.STATE column moves to the Available columns list.

  22. Click DEMOGRAPHICS_TARGET.CITY.
  23. Click <.

    The DEMOGRAPHICS_TARGET.CITY column moves to the Available columns list.

    You do not need the DEMOGRAPHICS_TARGET.STATE and DEMOGRAPHICS_TARGET.CITY columns because it is already defined in the GEOGRAPHIES_TARGET table. You will join the tables using the CITY column in both tables, which is unique.

  24. Click Add.

    The Expression Builder window opens.

  25. In the Expression field, type the following CASE statement:
    case 
    when POPULATION < 9000000   then 1
    when POPULATION < 15000000  then 2
    Else				 3
    end
    

    Figure db2tu062 not displayed.
  26. Click OK.

    A new column is added to the Selected columns list.

  27. Click the Name field of the new column and type the name of the column:
    SIZE_ID
    
  28. Press Enter.
  29. Click Move Up until the SIZE_ID column is above the POPULATION column.
  30. Click the Joins tab.
  31. In the GEOGRAPHIES_TARGET table, select the CITY column.
  32. In the DEMOGRAPHICS_TARGET table, select the CITY column.
  33. Click Join.

    Build SQL draws a line between the CITY columns, which indicates that the tables are joined on that column.


    The GEOGRAPHIES SELECT statement on the SQL Statement page.

  34. Click the Review tab to view the SQL statement that you just built.
  35. Click OK.

    Build SQL closes.

  36. On the SQL Statement tab, select the Create warehouse target table based on parameters check box.

    Selecting this check box specifies that the Data Warehouse Center is to create the target table, based on the values that are specified on the SQL Statement and Column Mapping pages.

  37. From the Warehouse target list, click Tutorial Targets.
  38. Click the Column Mapping tab.
  39. Verify that the source columns map to the correct target columns.
  40. Click the Processing Options tab.
  41. In the Population type list, verify that Replace is selected.
  42. Select the Run on demand check box to run the step on demand.

    Accept the rest of the default values on this page. For more information about the values on this page, see the online help.

  43. Click OK.

    The step Properties notebook closes.

  44. Change the name of the target table:

    1. Right-click the "Join_Market_T" target table.
    2. Click Properties.

      The table Properties notebook opens.

    3. In the Table schema field, verify that IWH is selected.
    4. In the Table name field, type the new name for the table:
      LOOKUP_MARKET
      
    5. In the Description field, type a description of the table:
      Market dimension data
      
    6. Verify that the Part of an OLAP schema check box and Dimension table radio button are selected.

      This table is one of the dimension tables that you will include in a star schema in Lesson 16, Creating a star schema from within the Data Warehouse Center.

    7. Click the Columns tab to view the column definitions.
    8. Clear the Nullable check box for the CITY_ID column because you will define this column as the primary key for the table in Lesson 11, Defining keys on target tables.
    9. Click the Warehouse Primary Key tab.
    10. Click CITY_ID in the Available columns list.
    11. Click >

      The CITY_ID column moves to the Warehouse primary key columns list.

    12. In the Constraint name field, type:
      "Whse Market PK"
      
    13. Click OK. The table Properties notebook closes.

    Figure db2tu042 not displayed.
  45. Click the Save icon in the toolbar to save the process.
    Figure wc_save not displayed.
  46. Close the Process Model window.


[ Top of Page | Previous Page | Next Page | Table of Contents ]