IBM Relationship Resolution Information Center, Version 4.2

Net change utility

The net change utility is a Java™-based tool that compares new, incoming data to existing data from the same source.

The utility checks incoming records against an existing data set to determine whether each record represents an add, a change, a delete, or a seen record.

Use the net change utility whenever you have incoming data that repeats data that you already have. For example, if one of the monthly data source feeds is a telephone directory, you might want to check for duplicates before loading the new file, because most telephone numbers do not change from month to month.

Eliminating duplicate records spares the conversion utility or pipelines from having to process them and can reduce the overall system processing time.

The settings in the configuration file describe runtime parameters essential to comparing records. Settings that define the width of the record, record criteria, and record key must be accurate for the net change utility to be successful. For example, if the total file size is not evenly divisible by the record length plus the padding length, the net change utility displays an error and stops.
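
The divisibility condition in the example above can be sketched as follows. This is a minimal illustration in Python, not the utility's own code: the function name check_record_alignment and its parameter names are hypothetical, and the actual setting names in the configuration file may differ.

```python
def check_record_alignment(file_size: int, record_length: int, padding_length: int) -> int:
    """Return the number of records in a fixed-width file.

    Raises ValueError when the file size is not an exact multiple of
    (record_length + padding_length), which is the condition under which
    the net change utility is described as reporting an error and stopping.
    All names here are hypothetical and chosen for illustration only.
    """
    stride = record_length + padding_length  # bytes occupied by one record plus its padding
    if stride <= 0:
        raise ValueError("record length plus padding length must be positive")
    if file_size % stride != 0:
        raise ValueError(f"file size {file_size} is not evenly divisible by {stride}")
    return file_size // stride
```

Under these assumptions, a 1,000-byte file with 98-byte records and 2 bytes of padding holds exactly 10 records, whereas a 1,001-byte file with the same settings is rejected.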

The net change utility compares an incoming record set to an existing base record set. It creates a difference file (.dif) that specifies whether each record represents an add, change, delete, or seen record. The utility also creates a merge file (.merge) that overwrites the original base record set and serves as the next base record set. The net change utility can then send the difference file to the UMF file conversion utility or another UMF generation utility or, if the data source provides the records in UMF format, directly to a pipeline.
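
The comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the utility's implementation: records are modeled as in-memory dictionaries keyed by record key, the function name net_change is hypothetical, and the real file formats of the difference and merge files are not reproduced. Delete handling depends on the comparison type and is left out here.

```python
from typing import Dict, List, Tuple


def net_change(
    base: Dict[str, str], incoming: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], Dict[str, str]]:
    """Compare incoming records to base records, both keyed by record key.

    Returns (difference, merged):
      difference - (status, key) pairs, where status is "add", "change", or "seen"
      merged     - the record set that would serve as the next base
    """
    difference: List[Tuple[str, str]] = []
    merged = dict(base)  # start from the existing base record set
    for key, record in incoming.items():
        if key not in base:
            difference.append(("add", key))     # key not in the base: new record
        elif base[key] != record:
            difference.append(("change", key))  # key known, content differs
        else:
            difference.append(("seen", key))    # identical record already in the base
        merged[key] = record                    # the next base reflects the incoming record
    return difference, merged
```

In this sketch, only the records marked "add" or "change" would need to go on to conversion or to a pipeline; the "seen" records are the duplicates that downstream processing can skip.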

The net change utility performs two kinds of comparisons:

  1. Incremental (default): Use an incremental comparison when you have incomplete incoming data sets; for example, you are loading only A-M records today and N-Z records tomorrow. That way, processing the A-M records today does not alter the N-Z records in the base file. Incremental comparisons:
    • Ignore records that are in the base file but missing from the source file.
    • Mark no records for deletion in the difference file (.diff), and delete no records from the new base file.
    • Require that the --do-delete switch be omitted from the command line.
  2. Full comparison: Use a full comparison when incoming data provides all the records from a data set every time, and you want to delete any records that are in the database but are missing from the input data. Full comparisons:
    • Evaluate the entire incoming data set against the entire base file.
    • Mark for deletion any record that is in the base file but missing from the source. Records marked for deletion are deleted from the new base file.
    • Must be indicated on the command line by including the --do-delete switch.
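
The difference between the two comparison types comes down to how base records that are missing from the incoming data are treated. The following Python sketch illustrates that distinction only. The function name resolve_missing and the do_delete parameter are hypothetical stand-ins (the parameter mirrors the --do-delete switch), and records are modeled as dictionaries keyed by record key rather than as files.

```python
from typing import Dict, List, Tuple


def resolve_missing(
    base: Dict[str, str], incoming: Dict[str, str], do_delete: bool = False
) -> Tuple[List[str], Dict[str, str]]:
    """Return (deleted_keys, new_base) for one comparison run.

    do_delete=False models an incremental comparison (the default);
    do_delete=True models a full comparison.
    """
    # Keys present in the base record set but absent from the incoming data.
    missing = [key for key in base if key not in incoming]
    if not do_delete:
        # Incremental: ignore missing records and keep them in the new base.
        return [], {**base, **incoming}
    # Full: mark missing records for deletion and drop them from the new base,
    # which leaves exactly the incoming record set.
    return missing, dict(incoming)
```

For example, with a base that holds keys A1 and N1 and an incoming set that holds only A1, the incremental run keeps N1 in the new base and reports no deletions, while the full run reports N1 as deleted and drops it.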


Last updated: 2009