(C) IBM Corp. 1996, 1999
Text Extender: Administration and Programming
This glossary defines many of the terms and abbreviations used in this
manual. If you do not find the term you are looking for, refer to the
index or to the Dictionary of Computing, New York: McGraw-Hill, 1994.
- A
- access function
- A user-provided function that converts the data type of text stored in a
column to a type that can be processed by Text Extender.
- administration
- The task of preparing text documents for searching, maintaining indexes,
and getting status information.
- API
- Application programming interface.
- application programming interface (API)
- A general-purpose interface between application programs and the Text
Extender information retrieval services.
- B
- Boolean search
- A search in which one or more search terms are combined using Boolean
operators.
- bound search
- A search in Korean documents that respects word boundaries.
- browse
- To view text displayed on a computer monitor.
- browser
- A Text Extender function that enables you to display text on a computer
monitor.
- C
- catalog view
- A view of a system table created by Text Extender for administration
purposes. A catalog view contains information about the tables and
columns that have been enabled for use by Text Extender.
- CCSID
- Coded Character Set Identifier.
- code page
- An assignment of graphic characters and control function meanings to all
code points. For example, assignment of characters and meanings to 256
code points for an 8-bit code.
- command line processor
- A program called DB2TX that allows you to enter Text Extender commands,
processes the commands, and displays the result.
- common-index table
- A DB2 table whose text columns share a common text index. See also
multi-index table.
- count
- A keyword used to specify the number of levels (the depth) of terms in the
thesaurus that are to be used to expand the search term for the given
relation.
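As a rough illustration of how a depth limit bounds thesaurus expansion, the
following Python sketch expands a search term level by level. The thesaurus
contents, the dictionary layout, and the `expand_term` helper are hypothetical
and are not part of Text Extender.

```python
# Hypothetical thesaurus for a single relation: term -> directly related terms.
THESAURUS = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "coupe"],
    "truck": ["pickup"],
}

def expand_term(term, count):
    """Return the term plus related terms found up to `count` levels deep."""
    found = [term]
    frontier = [term]
    for _ in range(count):
        next_frontier = []
        for current in frontier:
            for related in THESAURUS.get(current, []):
                if related not in found:
                    found.append(related)
                    next_frontier.append(related)
        frontier = next_frontier
    return found

print(expand_term("vehicle", 1))  # ['vehicle', 'car', 'truck']
print(expand_term("vehicle", 2))  # adds 'sedan', 'coupe', 'pickup'
```

A depth of 0 leaves the search term unexpanded in this sketch.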
- D
- data stream
- Information returned by an API function, comprising text (at least one
paragraph) containing the term searched for, and information for highlighting
the found term in that text.
- DB2 Extender
- One of a group of programs that let you store and retrieve data types
beyond the traditional numeric and character data, such as image, audio, and
video data, and complex documents.
- DBCS
- Double-byte character set.
- dictionary
- A collection of language-related linguistic information that Text Extender
uses during text analysis, indexing, retrieval, and highlighting of documents
in a particular language.
- disable
- To restore a database, a text table, or a text column to its condition
before it was enabled for Text Extender by removing the items created during
the enabling process.
- distinct type
- See user-defined distinct type.
- document
- See text document.
- document handle
- See handle.
- document model
- The definition of the structure of a document in terms of the sections
that it contains. A document model makes Text Extender aware of the
sections within documents when indexing. A document model lists the
markup tags that identify the sections. For each tag you can specify a
descriptive section name for use in queries against that section. You
can specify one or more document models in a document model file.
- dual index
- A text index having the characteristics of a precise
index and a linguistic index. See also Ngram
index.
- E
- enable
- To prepare a database, a text table, or a text column for use by Text
Extender.
- environment variable
- A variable used to provide defaults for values for the Text Extender
environment.
- environment profile
- A script provided with Text Extender containing settings for
environment variables.
- escape character
- A character indicating that the subsequent character is not to be
interpreted as a masking character.
- expand
- The action of adding to a search term additional terms derived from a
thesaurus.
- extended matching
- A process involving the use of a dictionary to highlight terms
that are not obvious matches of the search term.
- extender
- See DB2 Extender.
- external file
- A text document in the form of a file stored in the operating system's
file system, rather than in the form of a cell in a table under the control of
DB2.
- F
- feature search
- A search for terms such as names of people, places, or organizations, made
in a linguistic index created using the FEATURE_EXTRACTION indexing
option.
- file handle
- See handle.
- format
- The type of a document, such as ASCII or WordPerfect.
- free-text search
- A search in which the search term is expressed as free-form text: a
phrase or a sentence describing in natural language the subject to be searched
for.
- function
- See access function.
- fuzzy search
- A search that can find words whose spelling is similar to that of the
search term.
- H
- handle
- A binary value that identifies a text document. It includes a document
ID, the name and location of the associated index, the document's text
information, and, if the document is located in an external file not under
the control of DB2, the path and name of the file.
A handle is created for each text document in a text column when that
column is enabled for use by Text Extender.
- highlighting information
- See data stream.
- hybrid search
- A combined Boolean search and free-text
search.
- I
- index
- To extract significant terms from text, and store them in a text
index.
- index characteristics
- Properties of a text index determining the directory where the index is
stored, the index type, the frequency with which the index is updated, and
when the first index update is to occur.
- index type
- A characteristic of a text index determining whether it
contains exact or linguistic forms of document terms, or both. See
precise index, linguistic index, dual index,
and Ngram index.
- initialized handle
- A handle, prepared in advance, containing only the text format,
or the text language, or both.
- instance
- A logical Text Extender environment. You can have several instances
of Text Extender on the same workstation, but only one instance for each DB2
instance. You can use these instances to separate the development
environment from the production environment, or to restrict sensitive
information to a particular group of people.
- instance variable
- A variable used to provide a default value for the name of the
instance owner, or the name of the instance owner's home
directory.
- L
- language
- The name of a dictionary to be used when indexing,
searching and browsing.
- linguistic index
- A text index containing terms that have been reduced to their
base form by linguistic processing. "Mice", for example, would
be indexed as "mouse". See also precise index,
Ngram index, and dual index.
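The difference between exact and base-form indexing can be sketched in a few
lines of Python. The `BASE_FORMS` table and both helper functions are
hypothetical, and lowercasing in the linguistic case is an assumption of this
sketch; real linguistic processing relies on a full dictionary.

```python
# Hypothetical base-form table; a real dictionary is far larger.
BASE_FORMS = {"mice": "mouse", "ran": "run", "running": "run"}

def precise_terms(text):
    """Keep each term exactly as it occurs in the text."""
    return text.split()

def linguistic_terms(text):
    """Reduce each term to its base form (assumed to be lowercased first)."""
    return [BASE_FORMS.get(word.lower(), word.lower()) for word in text.split()]

print(precise_terms("Mice ran"))     # ['Mice', 'ran']
print(linguistic_terms("Mice ran"))  # ['mouse', 'run']
```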
- logical node
- A node assigned with other nodes to the same physical
machine. See also physical node.
- log table
- A table created by Text Extender containing information about which text
documents are to be indexed. Triggers are used to store this
information in a log table whenever a document in an enabled text column is
added, changed, or deleted.
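The relationship between an enabled text column and its log table can be
pictured with a small Python model. The `EnabledTextColumn` class, its method
names, and the layout of the log entries are hypothetical; in the product this
bookkeeping is done by database triggers, not by application code.

```python
class EnabledTextColumn:
    """Toy stand-in for an enabled text column with trigger-like logging."""

    def __init__(self):
        self.documents = {}   # doc_id -> text
        self.log_table = []   # (operation, doc_id) entries awaiting indexing

    def insert(self, doc_id, text):
        self.documents[doc_id] = text
        self.log_table.append(("INSERT", doc_id))

    def update(self, doc_id, text):
        self.documents[doc_id] = text
        self.log_table.append(("UPDATE", doc_id))

    def delete(self, doc_id):
        del self.documents[doc_id]
        self.log_table.append(("DELETE", doc_id))

column = EnabledTextColumn()
column.insert(1, "first draft")
column.update(1, "second draft")
column.delete(1)
print(column.log_table)
```

Every change leaves one entry behind, so a later indexing run only has to read
the log rather than rescan the column.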
- M
- masking character
- A character used to represent optional characters at the front, middle,
or end of a search term. Masking characters are normally used for
finding variations of a term in a precise index.
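To show how masking and escape characters interact, the following Python
sketch translates a masked search term into a regular expression. The
particular characters chosen here (`%` for any run of characters, `_` for
exactly one character, `!` as the escape character) and the `mask_to_regex`
helper are assumptions made for illustration; this glossary does not specify
them.

```python
import re

def mask_to_regex(term, multi="%", single="_", escape="!"):
    """Translate a masked search term into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == escape and i + 1 < len(term):
            # The character after the escape character is taken literally.
            parts.append(re.escape(term[i + 1]))
            i += 2
            continue
        if ch == multi:
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == single:
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))

words = ["index", "indexes", "indexing", "100%", "1000"]
masked = mask_to_regex("index%")
print([w for w in words if masked.fullmatch(w)])   # ['index', 'indexes', 'indexing']
literal = mask_to_regex("100!%")
print([w for w in words if literal.fullmatch(w)])  # ['100%']
```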
- match
- The occurrence of a search term in a text document.
- multi-index table
- A DB2 table whose text columns have individual text
indexes. See also common-index table.
- N
- Ngram index
- A text index that supports DBCS documents and fuzzy search of
SBCS documents. See also linguistic index, precise
index, and dual index.
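One common way to obtain fuzzy matching from character n-grams is to compare
the sets of n-grams of two words. The sketch below uses trigrams and Jaccard
similarity; the helper names, the padding with spaces, and the threshold are
assumptions of this example, and the glossary does not say which method Text
Extender uses.

```python
def ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with spaces."""
    padded = f" {word.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def similarity(a, b):
    """Jaccard similarity of the two n-gram sets, between 0 and 1."""
    grams_a, grams_b = ngrams(a), ngrams(b)
    return len(grams_a & grams_b) / len(grams_a | grams_b)

def fuzzy_search(term, vocabulary, threshold=0.3):
    """Return the words whose n-gram similarity to `term` meets the threshold."""
    return [word for word in vocabulary if similarity(term, word) >= threshold]

vocabulary = ["retrieval", "retreival", "index"]
print(fuzzy_search("retrieval", vocabulary))  # ['retrieval', 'retreival']
```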
- node
- A server in a partitioned database environment. See also
logical node, physical node, and
nodegroup.
- nodegroup
- A named subset of one or more database partition servers.
- O
- occurrence
- Synonym for match.
- P
- partitioned database
- A database consisting of several parts, each of which is maintained by a
separate database partition server.
- periodic indexing
- Indexing at predetermined time intervals, expressed in terms of the day,
hour, and minute, and the minimum number of document names that must be
listed in the log table for indexing, before indexing can take
place.
- physical node
- A node assigned to a physically separate machine. See
also logical node.
- precise index
- A text index containing terms exactly as they occur in the text
document from which they were extracted. See also linguistic
index, Ngram index, and dual index.
- profile
- See environment profile.
- R
- rank
- An absolute value of type DOUBLE between 0 and 1 that indicates how well a
document meets the search criteria relative to the other found
documents. The value indicates the number of matches found in the
document in relation to the document's size.
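The exact ranking formula is not given in this glossary. As one possible
reading of the definition, the following Python sketch computes the number of
matches per unit of document size and scales it against the best document in
the result set, so that every rank falls between 0 and 1. The function name,
the inputs, and the normalization are all assumptions.

```python
def rank_documents(match_counts, sizes):
    """Map each found document to a rank between 0 and 1.

    match_counts: doc_id -> number of matches in that document
    sizes:        doc_id -> document size (for example, a word count)
    """
    density = {doc: match_counts[doc] / sizes[doc] for doc in match_counts}
    if not density:
        return {}
    best = max(density.values())
    if best == 0:
        return {doc: 0.0 for doc in density}
    # Scale relative to the best-matching document in the result set.
    return {doc: value / best for doc, value in density.items()}

ranks = rank_documents({"a": 4, "b": 2}, {"a": 64, "b": 128})
print(ranks)  # {'a': 1.0, 'b': 0.25}
```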
- refine
- To add the search criteria from a previous search to other search criteria
to reduce the number of matches.
- retrieve
- To find a text document using a search argument in one of Text
Extender's search functions.
- S
- SBCS
- Single-byte character set.
- search argument
- The conditions specified when making a search, consisting of one or
several search terms, and search parameters.
- shell profile
- See environment profile.
- stop word
- A common word, such as "before", in a text document
that is to be excluded from the text index, and ignored if included
in a search argument.
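Stop-word handling amounts to filtering terms against a fixed list, both when
indexing and when reading a search argument. A minimal Python sketch, using a
hypothetical stop-word list and helper name:

```python
# Hypothetical stop-word list; real lists are language-specific and longer.
STOP_WORDS = {"before", "the", "a", "of"}

def remove_stop_words(terms):
    """Drop stop words from a list of terms, ignoring case."""
    return [term for term in terms if term.lower() not in STOP_WORDS]

print(remove_stop_words("back up the database before migration".split()))
# ['back', 'up', 'database', 'migration']
```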
- T
- text column
- A column containing text documents.
- text configuration
- Default settings for index, text, and processing values.
- text document
- Text of type CHAR, GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC, DBCLOB, VARCHAR,
LONG VARCHAR, or CLOB, stored in a DB2 table.
- text index
- A collection of significant terms extracted from text documents.
Each term is associated with the document from which it was extracted.
A significant improvement in search time is achieved by searching in the index
rather than in the documents themselves. See also precise
index, linguistic index, and dual index.
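The reason an index speeds up searching can be seen in a simple inverted
index, sketched here in Python. The `build_text_index` helper and the sample
documents are hypothetical, and splitting on whitespace stands in for real
term extraction.

```python
from collections import defaultdict

def build_text_index(documents):
    """Map each term to the set of document IDs in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

documents = {
    1: "text index search",
    2: "search in documents",
    3: "precise index",
}
index = build_text_index(documents)
print(sorted(index["index"]))   # [1, 3]
print(sorted(index["search"]))  # [1, 2]
```

A lookup touches only the entry for the search term, whereas searching the
documents themselves would mean scanning every document's text.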
- text information
- Properties of a text document describing the CCSID, the format, and the
language.
- text table
- A DB2 table containing text columns.
- tracing
- The action of storing information in a file that can later be used in
finding the cause of an error.
- trigger
- A mechanism that automatically adds information about documents that need
to be indexed to a log table whenever a document is added to, changed in,
or deleted from a text column.
- U
- UDF
- User-defined function.
- UDT
- User-defined distinct type.
- update frequency
- The frequency with which a text index is updated, expressed in terms of
the day, hour, and minute, and the minimum number of document names that must
be listed in the log table for indexing, before indexing can take
place.
- user-defined distinct type (UDT)
- A data type created by a user of DB2, in contrast to a data type provided
by DB2 such as LONG VARCHAR.
- user-defined function (UDF)
- An SQL function created by a user of DB2, in contrast to an SQL function
provided by DB2. Text Extender provides search functions, such as
CONTAINS, in the form of UDFs.
- W
- wildcard character
- See masking character.