Address hygiene and standardization is the pipeline process that normalizes address information to correct possible errors and transpositions and to prepare the identity record for optimal entity resolution processing.
As part of the address hygiene process, the pipelines parse and standardize address information. For example, Street becomes St, and 123-A Main St becomes 123 Main St Apt A.
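The parsing step described above can be illustrated with a minimal sketch. It is not the product's implementation: the function name `standardize_street_line`, the suffix table, and the unit-letter pattern are all assumptions made for illustration, and a real address hygiene product relies on much larger reference data.

```python
import re

# Hypothetical suffix table; a real hygiene product uses far larger reference data.
SUFFIX_ABBREVIATIONS = {"street": "St", "avenue": "Ave", "drive": "Dr", "road": "Rd"}

# Matches a house number with a trailing unit letter, for example "123-A".
HOUSE_UNIT = re.compile(r"^(\d+)-([A-Za-z])$")


def standardize_street_line(line: str) -> str:
    """Abbreviate a trailing street suffix and move a unit letter to an 'Apt' suffix."""
    tokens = line.split()
    if not tokens:
        return ""

    # Split "123-A" into the house number and the unit letter.
    unit = None
    match = HOUSE_UNIT.match(tokens[0])
    if match:
        tokens[0] = match.group(1)
        unit = match.group(2).upper()

    # Abbreviate only the final token, which is assumed to be the street suffix.
    last = tokens[-1]
    tokens[-1] = SUFFIX_ABBREVIATIONS.get(last.lower().rstrip("."), last)

    if unit:
        tokens += ["Apt", unit]
    return " ".join(tokens)


print(standardize_street_line("123-A Main Street"))
```

The sketch treats the last token as the street suffix, so it is only suitable for a street line that has already been separated from the city, state, and postal code.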
This pipeline process also verifies new or changed information against a global address database and standardization software provided by the IBM InfoSphere QualityStage product or by another address hygiene product, such as the Group 1 Software CODE-1 product. The chosen address hygiene product determines whether the address information is correctly formatted, corrects any detected misspellings (such as misspelled street names), and supplies or corrects any missing or incorrect information (such as updating the city name to match the postal code and address).
The following table shows examples of address cleansing and standardization, from the original address to the corrected, standardized address.
Original address | Standardized address |
---|---|
460 Oak Street Mill Valleu, CA 94914 | 460 South Oak Street Mill Valley, CA 94914 |
4737 Simeron Drive Easton, MA 02334 | 4737 Cimmeron Drive Easton, MA 02334 |
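The kind of correction shown in the table can be sketched as a lookup against reference data keyed by postal code. This is only an illustration under stated assumptions: the `REFERENCE` dictionary and the `verify_address` function are hypothetical, the reference entries are copied from the table rows, and the fuzzy match uses Python's standard `difflib` module rather than the matching logic of any address hygiene product.

```python
import difflib
from typing import Dict, Tuple

# Hypothetical reference data keyed by postal code, copied from the table rows.
REFERENCE: Dict[str, Dict[str, object]] = {
    "94914": {"city": "Mill Valley", "streets": ["South Oak Street"]},
    "02334": {"city": "Easton", "streets": ["Cimmeron Drive"]},
}


def verify_address(street: str, city: str, postal_code: str) -> Tuple[str, str]:
    """Return (street, city) corrected against the reference data, if any exists."""
    entry = REFERENCE.get(postal_code)
    if entry is None:
        # Unknown postal code: leave the address unchanged.
        return street, city

    # Update the city name to match the postal code.
    corrected_city = str(entry["city"])

    # Replace the street name with its closest known spelling, if one is close enough.
    matches = difflib.get_close_matches(street, list(entry["streets"]), n=1, cutoff=0.6)
    corrected_street = matches[0] if matches else street
    return corrected_street, corrected_city


print(verify_address("Simeron Drive", "Easton", "02334"))
print(verify_address("Oak Street", "Mill Valleu", "94914"))
```

The cutoff of 0.6 is an arbitrary choice for this sketch; a lower value accepts more distant spellings at the risk of false corrections.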
The address hygiene and standardization pipeline process retains both the original address and the corrected, enhanced address to increase the confidence levels of later entity resolution and relationship detection. Retaining both versions also provides better historical information.
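One way to picture retaining both versions is a record that stores the address as received alongside the standardized result. The `AddressRecord` class and its members below are hypothetical names used for illustration only; they do not describe how the product stores addresses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressRecord:
    """Hypothetical record that keeps the address as received and as standardized."""

    original: str
    standardized: Optional[str] = None  # None until address hygiene has run

    @property
    def was_corrected(self) -> bool:
        """True when hygiene produced an address that differs from the original."""
        return self.standardized is not None and self.standardized != self.original

    def match_candidates(self) -> List[str]:
        """Addresses that later matching could compare against, standardized first."""
        if self.was_corrected:
            return [self.standardized, self.original]
        return [self.original]


record = AddressRecord(
    original="4737 Simeron Drive Easton, MA 02334",
    standardized="4737 Cimmeron Drive Easton, MA 02334",
)
print(record.match_candidates())
```

Keeping the original value immutable next to the standardized one means that a later comparison can still use the address exactly as it was supplied.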