As identity data comes into the system for processing, the pipeline checks the quality of the data to protect the integrity of the entity database. Each incoming identity record is tested for proper Universal Message Format (UMF) construction, required values, valid data types, and configured data source codes.
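The four checks named above can be pictured with a minimal sketch. It is not the product's implementation: the tag names (`UMF_ENTITY`, `DSRC_CODE`, `DSRC_ACCT`, `DOB`), the date format, and the configured source codes are illustrative assumptions, and `check_record` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical configuration for illustration only.
REQUIRED_TAGS = ("DSRC_CODE", "DSRC_ACCT")
KNOWN_DATA_SOURCES = {"1001", "1002"}


def check_record(umf_text):
    """Return a list of data quality problems found in one UMF-like record."""
    # 1. Construction: the record must at least be well-formed XML.
    try:
        root = ET.fromstring(umf_text)
    except ET.ParseError as exc:
        return [f"malformed UMF: {exc}"]

    problems = []

    # 2. Required values: each required tag must be present and non-empty.
    for tag in REQUIRED_TAGS:
        if not (root.findtext(tag) or "").strip():
            problems.append(f"missing required value: {tag}")

    # 3. Data types: an assumed date field must parse as YYYY-MM-DD.
    dob = root.findtext("DOB")
    if dob is not None:
        try:
            datetime.strptime(dob.strip(), "%Y-%m-%d")
        except ValueError:
            problems.append(f"invalid date in DOB: {dob!r}")

    # 4. Data source code: the code must be one that is configured.
    source = (root.findtext("DSRC_CODE") or "").strip()
    if source and source not in KNOWN_DATA_SOURCES:
        problems.append(f"unconfigured data source code: {source}")

    return problems
```

An empty list means the record passed all four checks. A record that cannot be parsed returns a single construction problem, because the remaining checks cannot run on it.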
As the pipeline checks data quality, it attempts to correct problems where correction is possible and the system is configured to do so. To decide whether to correct a data quality problem, the system uses the configured data quality management (DQM) rules. DQM rules define which defects on incoming identity records the system may correct and which defects it may leave as-is while still processing the record.
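One way to picture a DQM rule set is as a lookup from defect to action. The sketch below is an assumption-laden illustration, not the product's rule format: the defect names, the `Action` values, the correction functions, and the choice to reject any defect that has no rule are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    CORRECT = "correct"
    ACCEPT = "accept as-is"
    REJECT = "reject"


# Hypothetical rule table: defect name -> configured action.
DQM_RULES = {
    "untrimmed_whitespace": Action.CORRECT,
    "missing_middle_name": Action.ACCEPT,
}

# Hypothetical corrections; a defect without one cannot be corrected.
CORRECTIONS = {
    "untrimmed_whitespace": lambda rec: {k: v.strip() for k, v in rec.items()},
}


def apply_dqm(record, defects):
    """Return (record or None, log entries) after applying the rules."""
    log = []
    for defect in defects:
        # Assumption: a defect with no configured rule is rejected.
        action = DQM_RULES.get(defect, Action.REJECT)
        fix = CORRECTIONS.get(defect)
        if action is Action.CORRECT and fix is None:
            # Configured for correction, but no correction is possible.
            action = Action.REJECT
        if action is Action.CORRECT:
            record = fix(record)
        log.append((defect, action.value))
        if action is Action.REJECT:
            return None, log
    return record, log
```

For example, `apply_dqm({"name": " Ann "}, ["untrimmed_whitespace"])` returns the trimmed record together with a log entry for the correction, while a defect that has no rule returns `None` in place of the record.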
To review the data quality for a particular data source, view or print the Load Summary report. Its Quality summary section gives insight into the overall data quality for that data source or for a particular set of identity records loaded from it. Use this information to adjust your ETL process for that data source as necessary.
Standard logging and error handling records every data quality error and correction, including errors that the system could not or did not correct. Check the system logs frequently so that you are aware of data quality errors that pipeline processing did not correct. In most cases, you must correct these errors and then reload the corrected identity records into a pipeline for entity resolution processing.
If it is configured to do so, the system can automatically add unrecognized codes as new codes. The UMF_EXCEPT log records each new code that the system added, and each record that was rejected and not processed because the system did not recognize a code and was not configured to add it.
| Code | Quality check | UMF_EXCEPT log |
|---|---|---|
| Addr_Type x | New code added | write to log |
| Num_Type xxx | New code rejected | write to log |
In both cases, the system logs the action to the appropriate log file.
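The add-or-reject behavior shown in the table can be sketched as follows. This is a simplified model under stated assumptions: `resolve_code` is a hypothetical function, the configuration is modeled as a set of code types that allow automatic additions, and an in-memory list stands in for the UMF_EXCEPT log.

```python
def resolve_code(code_type, value, known_codes, auto_add_types, except_log):
    """Return True if a record carrying this code may continue processing.

    known_codes    -- dict mapping a code type to its set of known values
    auto_add_types -- code types configured to accept new codes (assumption)
    except_log     -- list standing in for the UMF_EXCEPT log (assumption)
    """
    known = known_codes.setdefault(code_type, set())
    if value in known:
        # Recognized code: nothing to add and nothing to log.
        return True
    if code_type in auto_add_types:
        known.add(value)
        except_log.append((code_type, value, "New code added"))
        return True
    # Unrecognized code and not configured to add it: reject the record.
    except_log.append((code_type, value, "New code rejected"))
    return False
```

With `Addr_Type` configured for automatic additions and `Num_Type` not, the value `x` is added and the value `xxx` is rejected, and each outcome leaves one entry in the log list, mirroring the two rows of the table.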