Calculating common descriptive statistics

Use the Calculate Statistics transformer to calculate descriptive statistics on any number of data columns from a single table.

To use the Calculate Statistics transformer step, connect the step to a Warehouse source and a Warehouse target that exist in the same database. Alternatively, connect the step to a Warehouse source and specify that the step create a target table for you in the same database.

You can make changes to the step only when the step is in Development mode.

Authorities and privileges

To define a Calculate Statistics transformer step:

  1. Open the step notebook.

  2. Specify information for your step.

  3. Click the Parameters tab.

  4. Optional: From the Available columns list, select any columns that you want to use as grouping columns and click >. Grouping columns can contain character or numeric data.

  5. Define statistics calculations.

  6. On the Column Mappings page, map the output columns that result from your statistical calculations to columns in your target table.

    The column names for your statistical calculations are based on the data column that you select on the Parameters page and the statistic that you select for it. One column is created for each statistic that is selected for a data column. For example, if your data column, Sales, has the statistics "Sum" and "Average" defined for it, the columns Sales_sum and Sales_average are displayed on the Column Mappings page. Output columns are listed on the left side of the page, under the heading "Source Columns". Target columns from the output table linked to the step are listed on the right side of the page. Use the Column Mappings page to perform the following tasks:

    If the Parameters page produces no output columns, or if this step is not linked to a target table and you have not specified automatic generation of a default table on the Parameters page, you cannot use this page to map your columns. Some steps do not allow you to change the column mapping.

  7. On the Processing Options page, in the Agent Site list, select an agent site where you want your step to run. The selections in this list are agent sites that are common to the source tables, the target table, and the transformer or program that you are defining.

  8. If you want to have the option to run your step at any time, select the Run on demand check box. Your step must be in test or production mode before you can run it.

  9. Optional: Select the Populate externally check box if the step is populated externally, meaning that it is invoked in some way other than by the Data Warehouse Center. The step does not have to have any other means of running in the Data Warehouse Center in order to change the mode to production.

    If Populate externally is not selected, then the step must either have a schedule, be linked to a transient table that is input to another step, or be started by another program in order to change the mode to production.

  10. In the Retry area, specify how many times you want the step to be retried if a run fails and how much time you want to pass before each retry.

  11. In the Log table field, specify a log table.

  12. Optional: In the Trace level field, specify a trace level.

  13. Click OK to save your changes and close the step notebook.
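
As a rough illustration of the output described in steps 4 and 6, the following Python sketch groups rows by a grouping column and produces one output column per data column and statistic, named in the Sales_sum and Sales_average pattern. It is not the transformer's implementation. The function calculate_statistics, the STATISTICS table, the lowercase statistic names, and the sample rows are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup from a statistic name to a Python function.
# The real transformer offers its own set of statistics.
STATISTICS = {
    "sum": sum,
    "average": mean,
}


def calculate_statistics(rows, data_columns, grouping_columns=()):
    """Return one output row per group.

    rows             -- list of dicts, one per source-table row
    data_columns     -- dict mapping a data column to its statistic names
    grouping_columns -- optional sequence of columns to group by
    """
    # Collect the source rows that belong to each grouping key.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in grouping_columns)
        groups[key].append(row)

    output = []
    for key, members in groups.items():
        out_row = dict(zip(grouping_columns, key))
        for column, stat_names in data_columns.items():
            values = [member[column] for member in members]
            for stat_name in stat_names:
                # Output column name: <data column>_<statistic>
                out_row[f"{column}_{stat_name}"] = STATISTICS[stat_name](values)
        output.append(out_row)
    return output


# Made-up sample data: Sales is the data column, Region the grouping column.
rows = [
    {"Region": "East", "Sales": 100},
    {"Region": "East", "Sales": 300},
    {"Region": "West", "Sales": 50},
]
result = calculate_statistics(rows, {"Sales": ["sum", "average"]}, ["Region"])
for out_row in result:
    print(out_row)
```

Each resulting row carries the grouping column followed by Sales_sum and Sales_average. These are the names that would appear as source columns to be mapped to target columns on the Column Mappings page.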

Related information

Moving and transforming data

Population type descriptions

List of steps and step subtypes

Data Warehouse Center concepts