This glossary contains definitions for the Relationship Resolution terminology.
A
- account
- See identity.
- acquisition file processor (AFP)
- See UMF file conversion utility.
- acquisition programs
- The tools and programs that acquire data, convert it to a recognized format
(if necessary), and submit it to the pipeline for processing. These programs
can be configured to run in batch mode or in real-time mode.
- address hygiene
- The process that normalizes and standardizes address information to correct
possible errors and transpositions and to enable optimal matching and linking
between entities. Additional address correction software can be used to enhance
the address hygiene process.
- AFP (acquisition file processor)
- See UMF file conversion utility.
- alert
- A message or other indication that signals an event has occurred. See
also role alert and attribute alert.
- application monitor
- The component that monitors pipeline errors, status, and statistics and
that sends routing information to pipelines. The application monitor is installed
with the pipeline; however, it can also be installed separately.
- application monitor database
- The database that stores the routing and monitoring information for the
pipelines. See also entity database and Configuration Console
database.
- attribute
- A characteristic or trait that describes a person, organization, place,
or item. See also entity and identity.
- attribute alert
- An alert that identifies entities that match a specified set of attributes.
- attribute alert generator
- A set of attributes, which the customer defines, that the pipeline uses
to compare with the incoming attributes of identities. If the attributes match,
the pipeline generates an attribute alert.
- attribute type
- A specific classification of an attribute. The supported attribute types
are characteristics, numbers, names, addresses, and e-mail addresses. See
also attribute.
C
- candidate builder
- The configured set of attributes that are used to build the candidate
list. See also candidate list and attributes.
- candidate list
- The list of entities that have the potential to match the incoming identity.
The candidate list is built by retrieving those entities that share certain
attributes (such as numbers and addresses) with the incoming identity, based
on the attributes that are specified in the candidate builder. During the
re-resolve process, the list of entities is matched to the new composite
entity.
- candidate threshold
- The minimum score at which a particular attribute value must match between
the incoming identity and an existing entity to satisfy the resolution rule.
See also resolution rule.
- characteristic
- A user-defined trait or property that is associated with an identity that
is not commonly expressed as a name, number, address, or e-mail. This attribute
allows users to extend the product by defining customizable entity attributes
that are meaningful to their data sources. See also attribute and identity.
- cleansed pipeline search (CPS)
- Synonym for search.
- CME Admin (Central Messaging Engine Administrator)
- See application monitor.
- CME Admin node
- See Configuration Console.
- Configuration Console
- The graphical user interface that you use to configure the system, monitor
data and route messages, and view reports.
- Configuration Console database
- The database that stores the configuration settings for the Configuration
Console. See also application monitor database and entity
database.
- configuration utility
- A utility that is used after installation to modify configuration settings
for database configuration and WebSphere Application Server logging. It can
also be used to apply fixes to the Configuration Console and the Visualizer.
- conflict
- See role alert.
- conflict rules
- See role alert rules.
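
The candidate list entry in this section can be illustrated with a minimal sketch. It assumes that entities and the incoming identity are plain dictionaries that map an attribute type to a list of values. The function name `build_candidate_list` and the sample data are hypothetical and are not part of the product.

```python
from collections import defaultdict


def build_candidate_list(incoming, entities, builder_attributes):
    """Return the IDs of entities that share at least one value, for an
    attribute type named in the candidate builder, with the incoming identity."""
    # Index existing entities by (attribute type, value).
    index = defaultdict(set)
    for entity_id, attrs in entities.items():
        for attr_type in builder_attributes:
            for value in attrs.get(attr_type, ()):
                index[(attr_type, value)].add(entity_id)

    # Collect every entity that shares a configured attribute value.
    candidates = set()
    for attr_type in builder_attributes:
        for value in incoming.get(attr_type, ()):
            candidates |= index.get((attr_type, value), set())
    return sorted(candidates)


# Hypothetical sample data.
entities = {
    1: {"number": ["555-0100"], "address": ["12 Oak St"]},
    2: {"number": ["555-0199"], "address": ["12 Oak St"]},
    3: {"number": ["555-0142"], "address": ["9 Elm Ave"]},
}
incoming = {"number": ["555-0100"], "address": ["40 Pine Rd"]}

print(build_candidate_list(incoming, entities, ["number", "address"]))
```

In this sketch, only entity 1 shares a configured attribute value (the number) with the incoming identity, so it is the only candidate.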
D
- daemon
- A program that runs unattended to perform continuous or periodic functions,
such as network control.
- data mapping
- A defined mapping between the data in a UMF file and the corresponding
tables and table columns in the entity database. A data mapping must exist
to successfully load data into the entity database.
- data quality management (DQM)
- The pipeline process that checks the data for required values, valid data
types, and valid codes, and also corrects the data by providing default values,
formatting numbers and dates, and adding new codes, if that has been configured.
Data quality management includes address hygiene and name standardization
processing. See also address hygiene and name standardization.
- data source
- The data that contains the identities that you want to load into the entity
database. Data sources contain identifying data (unique, personal identifiers
for an identity) and non-identifying data (other attributes and data points
for an identity). The identity records in the data source must be exported
as Universal Message Format (UMF) before they can be loaded into the entity
database. Examples of data sources include employee databases, watch lists,
vendor lists, and customer lists.
- data source account
- When referring to the specific instance in a data source, see identity.
- When referring to the unique ID for a specific instance in a data source,
see external ID.
- data source code
- The user-defined identifier for the data source.
- data source record
- See identity.
- data source reference
- See external reference.
- degrees of separation
- A measurement of the relationship between two entities. The measurement
is an integer greater than or equal to zero that defines the minimum
number of entities involved in a chain of relationships, not including the
root entity. For example, if two entities are related, those entities are
separated by 1 degree and have a 1-degree relationship. See also relationship.
- detach
- The process of de-coupling an inbound identity from an entity and verifying
again that it should still be associated with that entity.
- disclosure
- A user-defined relationship between two identities in two separate entities.
- document types
- See UMF input document.
- DQM
- See data quality management.
- DQM rule
- A rule that defines how data is processed by the data quality management
(DQM) processes and DQM functions. DQM rules apply to specific UMF segments.
When you define a DQM rule, you define the DQM function, the specific parameters,
and the order in which the rule is processed. See also data quality
management and UMF segment.
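
The degrees of separation entry in this section can be illustrated with a breadth-first search over relationships. This is a sketch only: it assumes that relationships are undirected pairs of entity IDs, and the function name `degrees_of_separation` is hypothetical.

```python
from collections import deque


def degrees_of_separation(relationships, root, target):
    """Return the minimum number of entities in a chain of relationships
    from root to target, not counting root, or None if no chain exists."""
    # Build an undirected adjacency map from the relationship pairs.
    graph = {}
    for a, b in relationships:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    if root == target:
        return 0

    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        entity, degree = queue.popleft()
        for neighbor in graph.get(entity, ()):
            if neighbor == target:
                return degree + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, degree + 1))
    return None


# Hypothetical relationships: A-B, B-C, and an unconnected pair D-E.
relationships = [("A", "B"), ("B", "C"), ("D", "E")]
print(degrees_of_separation(relationships, "A", "B"))
print(degrees_of_separation(relationships, "A", "C"))
```

Directly related entities are 1 degree apart, and entities that are connected only through one intermediate entity are 2 degrees apart.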
E
- element
- (1) See attribute and attribute type.
- (2) In markup languages such as SGML, XML, and HTML, a basic unit consisting
of a start tag, end tag, associated attributes and their values, and any text
that is contained between the two.
- entity
- A collection of one or more identities that represent the same person,
organization, place, or item. See also identity.
- entity database
- The database that stores identities, entities, and data that is used for
relationships, resolutions, and alerts. The entity database might also store
configuration, routing, and monitoring settings, if users did not choose to
create a separate application monitor database and Configuration Console database.
See also application monitor database and Configuration
Console database.
- entity model
- The set of user-defined attributes that define an entity in the system.
See also attributes and entity.
- entity resolution
- The process that compares one or more identities and determines if they
represent the same entity or two different entities. If two identities are
determined to represent the same person, organization, place, or item, they
are resolved into a single entity; otherwise, they are unresolved into two
separate entities.
- event
- An event represents information about something that happened in the business
domain, such as "a customer opens an account" or "a customer wires money".
In IBM Relationship Resolution Event Manager, events contain attributes, which
are based on their corresponding event types.
- event alert
- An event alert occurs when a collection of complex events meets specified
criteria over a specified life span. Event alerts are based on business rules
that are defined in a complex event processor, and can indicate situations
of interest, such as two or more purchase transactions of more than 10,000
U.S. dollars that occurred in the last hour at locations 240 kilometers from
each other.
- event type
- An event type categorizes events and defines the unit of measure for the
value associated with events in IBM Relationship Resolution Event Manager.
Examples of event types include wire transfer, account opening, and credit
card transaction.
- Event types are required for event processing, because the user-defined
business rules that the event processor uses call a specific event type. If
the event type does not exist, the event processor cannot process the event.
- external ID
- A unique key that identifies an identity in the data source. An external
ID typically consists of a unique ID for the data source and a unique ID for
the identity within its original data source. For example, an external ID
for an identity in the customer records for a bank might contain the bank
name (for the data source) and the account number (for the identity in the
data source), such as FirstCapital, 0123456789.
- external reference
- An additional identifier for an identity in a data source. For example,
an employee data source might use the employee serial number as the external
ID and the employee's social security number as the external reference. Often,
however, the external reference is set to the same value as the external ID,
because the additional identifier is not necessary to uniquely identify an
identity.
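
The external ID and external reference entries in this section can be modeled as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, and defaulting the external reference to the identity's ID within the data source is an assumption based on the common practice described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalId:
    """Composite key: the data source plus the ID within that data source."""
    data_source_code: str
    record_id: str


@dataclass
class Identity:
    external_id: ExternalId
    external_reference: str = ""

    def __post_init__(self):
        # Assumption: when no additional identifier is supplied, reuse the
        # ID of the identity within its data source.
        if not self.external_reference:
            self.external_reference = self.external_id.record_id


# The bank example from the external ID entry.
customer = Identity(ExternalId("FirstCapital", "0123456789"))
print(customer.external_id, customer.external_reference)
```

Because `ExternalId` is frozen, it is hashable and can serve as a dictionary key when identities are looked up by data source and ID.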
F
- full attribution
- An auditing feature of the entity database whereby specific details are
stored on where the identities come from. By accumulating this context, all
data in the entity database can be traced back to the original source system.
G
- GDA (general data acquisition)
- See UMF database conversion utility.
- general data acquisition (GDA)
- See UMF database conversion utility.
- generic value
- A data value that has occurred in the database for multiple entities a
specific number of times. For example, a telephone number with a value of
555-555-5555 might be considered a generic value after it occurs in the database
10 times. See also generic threshold.
- generic threshold
- The number of times that a data value can occur in the database for multiple
entities before that data value is considered a generic value. See also generic
value.
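
The relationship between a generic value and the generic threshold can be sketched as a count of the distinct entities that share each value. The function name `find_generic_values` is hypothetical. Treating a value as generic once the count reaches the threshold is an assumption, because the entries above do not state whether the comparison is inclusive.

```python
from collections import defaultdict


def find_generic_values(observations, generic_threshold):
    """Return the values that occur for at least generic_threshold entities.

    observations is an iterable of (entity_id, value) pairs.
    """
    entities_per_value = defaultdict(set)
    for entity_id, value in observations:
        entities_per_value[value].add(entity_id)
    return {
        value
        for value, entity_ids in entities_per_value.items()
        if len(entity_ids) >= generic_threshold
    }


# Hypothetical data: ten entities share one telephone number, two share another.
observations = [(entity_id, "555-555-5555") for entity_id in range(10)]
observations += [(1, "555-0100"), (2, "555-0100")]

print(find_generic_values(observations, 10))
```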
H
- hash
- An alphanumeric string that is generated from another value in order to
aid in the searching and comparing of values within the entity database.
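
As an illustration of the hash entry, the sketch below derives a fixed-length alphanumeric string from an attribute value so that values can be compared by hash. The choice of SHA-256 and the normalization step are assumptions for this example; the glossary does not state which algorithm the product uses, and the function name `attribute_hash` is hypothetical.

```python
import hashlib


def attribute_hash(value):
    """Return a hexadecimal SHA-256 digest of a normalized attribute value."""
    # Collapse runs of whitespace and ignore case so that trivially
    # different spellings of the same value produce the same hash.
    normalized = " ".join(value.split()).upper()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


print(attribute_hash("12  Oak St") == attribute_hash("12 oak st"))
```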
I
- identity
- A collection of attributes from a data source that represent a person,
organization, place, or item.
- identity resolution
- See entity resolution.
L
- likeness score
- See resolution score.
M
- match merge rule
- See resolution rule.
N
- name standardization
- The process that normalizes names by reducing them to the most common
derivative or root name. For example, Richard is the most common derivative
or root name for Dick, Ricardo, Ricky, Rich, or Ritchie, and Mohammad is the
most common derivative or root name for Mohamad, Mohamed, or Mohammed.
- NCE (net change engine)
- See net change utility.
- net change engine (NCE)
- See net change utility.
- net change utility
- A utility that compares a fixed-width text file to a known data set and
either eliminates duplicate records between the incoming data and the known
data set or marks the records as add, change, or delete. This utility can
significantly reduce the number of records that are submitted to the pipelines
for processing.
- node
- In a network configuration, the physical machine that contains one or
more related functional units.
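
The comparison that the net change utility entry describes can be sketched as follows. This is not the utility itself: the function names, the fixed-width layout (a 4-character key followed by data), and the decision to both drop unchanged records and mark the remaining ones are assumptions made for illustration.

```python
def parse_fixed_width(lines, key_width):
    """Map each record's key (its first key_width characters) to the full record."""
    return {line[:key_width].strip(): line.rstrip("\n") for line in lines}


def net_change(known, incoming):
    """Return (action, key) pairs for records that differ from the known data set."""
    changes = []
    for key, record in incoming.items():
        if key not in known:
            changes.append(("add", key))
        elif known[key] != record:
            changes.append(("change", key))
        # Records identical to the known data set are duplicates and are dropped.
    for key in known:
        if key not in incoming:
            changes.append(("delete", key))
    return changes


# Hypothetical fixed-width records: a 4-character key, then a name field.
known = parse_fixed_width(["0001Alice     ", "0002Bob       ", "0003Carol     "], 4)
incoming = parse_fixed_width(["0001Alice     ", "0002Robert    ", "0004Dave      "], 4)

print(net_change(known, incoming))
```

In this sketch, only three of the four distinct records are reported, which shows how such a comparison reduces the number of records that are submitted for processing.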
O
- output messages
- See UMF output document.
P
- persistent search
- See attribute alert generator.
- pipeline
- The component that performs name standardization, data quality management,
address hygiene, and entity resolution. The pipeline also generates alerts,
based on the system configurations. See also name standardization, data
quality management, address hygiene, and entity
resolution.
- pipeline node
- The physical machine that contains one or more running pipeline processes.
See also pipeline.
- primary matching
- See candidate builder.
Q
- queue utility
- A utility that manages the transfer of data from a process or a file to
a queue in a queue manager, such as Microsoft Message Queuing or WebSphere
MQ.
- Qutil
- See queue utility.
R
- record
- The storage representation of a single row of a table or other data.
- relationship
- A link between two or more entities. A relationship is created based on
discovered data, disclosed data, or both. See also entity.
- relationship resolution
- See entity resolution.
- relationship score
- The value that is assigned during entity resolution as a result of applying
the resolution rules and that defines how closely the two compared identities
are related to each other. This score is fixed; that is, it is not adjusted
after the entity resolution process is completed. See also resolution
score.
- resolution rule
- The set of criteria that define how compared entities are resolved or
related.
- resolution score
- The value that is assigned during entity resolution as a result of the
confirmation and denial processing and that defines the likelihood that the
compared identities represent the same entity. This score is used to resolve
a new identity to an existing entity. See also relationship score.
- re-resolve
- The process of re-evaluating entities against existing entities, relationships,
or resolutions, and then resolving those entities appropriately.
- role
- A classification of an identity that defines the focus or purpose for
that identity. You can associate one or more roles with an identity.
- role alert
- An alert that identifies a single entity or two entities that contain
roles that the user has defined as of interest or as conflicting.
- role alert rules
- User-configured rules that identify one or more roles that cannot exist
in a single entity or cannot be linked between multiple entities.
- role code
- The unique identifier for a role. See also role.
- rule
- A set of conditional statements that enable computer systems to identify
relationships and run automated responses accordingly.
S
- service
- A program that performs a primary function within a server or related
software.
- SOAP
- A lightweight, XML-based protocol for exchanging information in a decentralized,
distributed environment. SOAP can be used to query and return information
and invoke services across the Internet. See also Web services.
T
- transliteration
- The process of changing characters that are represented in one alphabet
into the characters of another alphabet.
- transport
- A communication layer that allows the product to send and receive data
between the user data source and a pipeline. Examples of transports include
the HTTP transport, the queue transport, the database transport, and the file
transport.
U
- UDDI
- See Universal Description, Discovery, and Integration.
- UMF (Universal Message Format)
- A standard markup language, based on XML, for structuring data source
files. Data must be in a Universal Message Format (UMF) file before it can be
loaded into the entity database.
- UMF document
- The collection of UMF segments that structure the data.
- UMF elements
- XML tags and values that define the data within a UMF segment of a UMF
document.
- UMF record segment
- See UMF segment.
- UMF input document
- The collection of UMF segments that structure the incoming data to load,
modify, or query data in the entity database.
- UMF output document
- The collection of UMF segments that structure result data.
- UMF database conversion utility
- A utility that converts database files to Universal Message Format (UMF)
files. This utility is often customized for your specific database environment.
See also Universal Message Format.
- UMF file
- A file that contains one or more UMF documents. See also UMF document.
- UMF file conversion utility
- A utility that converts fixed-width text files to Universal Message Format
(UMF) files. See also Universal Message Format.
- UMF formatting utility
- A utility that formats Universal Message Format (UMF) files and extracts
UMF data to view the UMF records in wide format (one UMF record per line),
tall format (one UMF record across many lines), or for a specific UMF tag.
See also Universal Message Format.
- UMF message
- See UMF document.
- UMF segment
- The part of a UMF document that structures the data for the data source.
- Universal Description, Discovery, and Integration (UDDI)
- A set of standards-based specifications that enables companies and applications
to quickly and easily find and use Web services over the Internet. See also Web
services.
- unresolve
- The process of separating resolved identities into two separate entities.
V
- Visualizer
- The graphical user interface that analysts use to research alerts, view
relationships, search for entities, load data, and run reports.
W
- Web service
- A self-contained, self-describing modular application that can be published,
discovered, and invoked over a network using standard network protocols. Typically,
XML is used to tag the data, SOAP is used to transfer the data, WSDL is used
for describing the services available, and UDDI is used for listing what services
are available. See also SOAP, UDDI, and WSDL.
- WSDL (Web Services Description Language)
- An XML-based specification for describing networked services as a set
of endpoints operating on messages containing either document-oriented or
procedure-oriented information. See also Web services.
X
- Xutil
- See UMF formatting utility.