Documentation
(C) IBM Corp. 1996, 1999

Text Extender: Administration and Programming

ENABLE TEXT COLUMN

This command enables a text column for use by Text Extender.

Authorization

You must have at least one of the following for the table:

ALTER privilege

SELECT privilege

UPDATE privilege.

Command syntax

>>-ENABLE TEXT COLUMN--table-name--text-column-name------------->
 
>-----+--------------------------+--HANDLE--handle-column-name-->
      '-FUNCTION--function-name--'
 
>-----+-----------------------+--------------------------------->
      '-| text-information |--'
 
>-----+----------------------------+---------------------------->
      '-| index-characteristics |--'
 
>-----+----------------------------+---------------------------->
      '-UPDATEINDEX--+-UPDATE---+--'
                     '-NOUPDATE-'
 
>-----+------------------------------+-------------------------->
      '-TABLESPACE--tablespace-name--'
 
>-----+---------------------+----------------------------------><
      '-COMMITCOUNT--count--'
 
text-information
 
|---+---------------+---+---------------------+----------------->
    '-CCSID--ccsid--'   '-LANGUAGE--language--'
 
>-----+-----------------+---------------------------------------|
      '-FORMAT--format--'
 
index-characteristics
 
|---+-----------------------------------------------------------------+->
    '-INDEXTYPE--+-PRECISE----+---+--------------------------------+--'
                 +-LINGUISTIC-+   '-INDEXOPTION--+-CASE_ENABLED-+--'
                 +-DUAL-------+                  '-NORMALIZED---'
                 '-NGRAM------'
 
>----+--------------------------------------------------------------------------------+->
     |                                                          .-,------------.      |
     |                                                          V              |      |
     '-INDEXPROPERTY--SECTIONS_ENABLED--+-DOCUMENTMODEL--+---(-----modelname---+---)--'
                                        '-DOCUMENTMODELS-'
 
>----+----------------------------------------+----------------->
     '-UPDATEFREQ--+-NONE------------------+--'
                   '-| update-frequency |--'
 
>-----+-------------------------------------------------------------------------------+>
      +-DIRECTORY--directory----------------------------------------------------------+
      |  .--------------------------------------------------------------------------. |
      |  |                                          .-,---------------------.       | |
      |  V                                          V                       |       | |
      '----DIRECTORY--directory--ON--+-NODE--+---(-----node--+-----------+--+---)---+-'
                                     '-NODES-'               '-TO--node--'
 
>---------------------------------------------------------------|
 
update-frequency
 
|---MIN--(--mindocs--)--D---(--+-*------------+---)---H--------->
                               |  .-,-------. |
                               |  V         | |
                               '----0...6---+-'
 
                                       .-,---------.
                                       V           |
>----(--+-*-------------+---)---M---(-----0...59---+---)--------|
        |  .-,--------. |
        |  V          | |
        '----0...23---+-'
 

Command parameters

table-name
The name of the text table in the connected database that contains the column to be enabled. The name must include an explicit schema name (qualifier) unless the schema name is the same as your user ID.

text-column-name
The name of the column to be enabled for use by Text Extender. This column must be of the type CHAR, VARCHAR, LONG VARCHAR, CLOB, DBCLOB, GRAPHIC, VARGRAPHIC, or LONG VARGRAPHIC. If the document type is not one of these, use FUNCTION to convert the document type.

FUNCTION

function-name

The name of a user-defined function to be used by the Text Extender library services to access text documents that are in a column that is not of type CHAR, VARCHAR, LONG VARCHAR, CLOB, DBCLOB, GRAPHIC, VARGRAPHIC, or LONG VARGRAPHIC. See Enabling text columns of a nonsupported data type for more information.

HANDLE handle-column-name
The name of the handle column to be added to the table for use by Text Extender's UDFs.

CCSID ccsid
The Coded Character Set Identifier to be used when indexing text documents.

For an Ngram index, the CCSID must be the same as the CCSID of the database. To find the default CCSID, use:

db2tx get text cfg

The installation default is the database CCSID.

If this keyword is not specified, the CCSID specified in the text configuration settings is used. Subsequent changes to the text configuration settings are ignored; the value used is the one that existed at the time the column was enabled, not the one that exists when indexing text documents.

For information about other CCSIDs that can be supported, see CCSIDs.

LANGUAGE

language

The language in which the text is written. This determines which dictionary is to be used when indexing text documents and when searching in text documents. Chapter 3, Linguistic processing describes how dictionaries are used.

This keyword specifies the language once for the whole column. You can override this value for individually inserted text documents using the INIT_TEXT_HANDLE function in an INSERT statement.

If this keyword is not specified, the language specified in the text configuration settings is used. Subsequent changes to the text configuration settings are ignored; the value used is the one that existed at the time the column was enabled, not the one that exists when indexing text documents.

The supported languages are listed in Languages.

FORMAT

format

The type of text document stored, such as WordPerfect, or ASCII. Text Extender needs this information when indexing documents. The document formats supported are listed in Formats.

The document formats supported for structured documents are:

For these formats, you must specify the structure information in a document model file. See Working with structured documents. If the format TDS and INDEXPROPERTY SECTION_ENABLED are specified, it is assumed that the document format is ASCII_SECTIONS.

Tags that are not defined in the model file are indexed in the normal way, according to the index type.

This keyword specifies the format once for the whole column. You can override this value for individually inserted text documents using the INIT_TEXT_HANDLE function in an INSERT statement.

If this keyword is not specified, the format specified in the text configuration settings is used. Subsequent changes to the text configuration settings are ignored; the value used is the one that existed at the time the column was enabled, not the one that exists when indexing text documents.

INDEXTYPE
The type of index to be created. For more information, see Types of index.

PRECISE
Terms are indexed and searched for exactly as they occur in the text documents.

LINGUISTIC
Terms are processed linguistically before being indexed. Search terms are also processed linguistically before the search begins.

DUAL
Terms are indexed exactly as they occur in the text documents, and they are also indexed after being processed linguistically. When searching, you can decide for each term whether to search for the precise term or for the linguistically processed term.

NGRAM
Terms are indexed by parsing sets of characters rather than by using a dictionary. This index type is mandatory if the documents you are indexing contain DBCS characters, although an Ngram index can also be used for SBCS documents.

If you do not specify the INDEXTYPE keyword, the value in the text configuration settings is used.

The dual index type cannot be used to take advantage of document structure.

Documents in XML format are not supported for Ngram indexes.

INDEXOPTION
Options to be used when creating the index.

CASE_ENABLED
This option is available only for Ngram indexes. Normally, Ngram indexes do not allow a case-sensitive search. By specifying CASE_ENABLED, you ensure that documents are indexed such that a case-sensitive search is possible. For more information see Ngram index.

NORMALIZED
This option is available only for precise indexes. A normalized precise index differs from a precise index in that:

  • It is not case-sensitive; all words except those in all uppercase are converted to lowercase.

  • Words in all uppercase are not subject to stop-word filtering; the abbreviation UK, for example, is indexed.

  • English language search terms may be expanded to include lemma forms using a heuristic algorithm, so that a search for house also searches for houses.

INDEXPROPERTY SECTIONS_ENABLED DOCUMENTMODEL(S) model-name
Properties of a selected index type.

SECTIONS_ENABLED specifies that the selected index type can contain information about the document structure.

DOCUMENTMODEL/DOCUMENTMODELS model-name specifies the model or models to be associated as default for the documents to be indexed. A model name must be specified if the index property SECTIONS_ENABLED is used. If a list of models is specified, the first model is used as the default model for the index. The default model is used during indexing if the document has no reference to a model, or if no model is specified during search.

The characters that can be used for the model name are a-z, A-Z, and 0-9.

The specified model name must correspond to a model definition in the model definition file desmodel.ini.

To change the model or models associated with an index,

  1. Use DISABLE TEXT COLUMN to disable the index

  2. Use ENABLE TEXT COLUMN to reindex the documents, specifying different document model names.

UPDATEFREQ update-frequency
The index update frequency in terms of when the update is to be made, and the minimum number of text documents that must be queued in the log table. If there are not enough text documents in the log table at the day and time given, the index is not updated.

The syntax is described in Setting the frequency of index updates.

If you do not specify UPDATEFREQ, the default frequency specified in the text configuration settings is used.
Tip

If you have many tables, consider avoiding the use of the default values. By making individual update frequency settings for tables you can avoid indexing all the tables simultaneously and causing an unnecessarily prolonged load on your system resources.

NONE
No further index updates are made. This is intended for a text column in which there will be no further changes.

These update frequency settings are ignored if they have already been set for the whole table by ENABLE TEXT TABLE.

DIRECTORY directory
The directory path in which the text index is to be stored. The specified path is concatenated with ""txinsnnn" where nnn is the node number.

This is an existing directory on the system where the Text Extender server is running.

If you do not specify the DIRECTORY keyword, the value of the DIRECTORY setting in the text configuration settings is used.

This setting is ignored if it has already been set for the whole table by ENABLE TEXT TABLE.

If you are using partitioned databases, you can specify one common index path name for all nodes, or for a range of nodes, or different index directory paths for each node.

ON NODE node [TO node]
The number of the node or the range of nodes to which a directory path name is being assigned.

UPDATEINDEX
A keyword that determines whether the text documents associated with this handle column are indexed immediately after this command has completed, without waiting for the next periodic indexing set by UPDATEFREQ.

UPDATE
Indexing of the text documents occurs immediately after this command has completed.

NOUPDATE
Indexing occurs at a time set by the update frequency settings specified either in this command by UPDATEFREQ, or by the text configuration setting.

If you do not specify this keyword, the value in the text configuration settings is taken.

COMMITCOUNT count
A value from 500 to 1000000 indicating the number of inserts or updates after which DB2 UDB must issue an intermediate commit statement. This can avoid a situation in which there is insufficient log space when enabling large tables, or columns, or a large number of external files.

Usage notes

This command adds a handle column to the specified DB2 table. Each handle column is associated with a text column, and is used by Text Extender's UDFs.

If this table has not already been enabled to create a common index, an index is created that is associated with this text column.

Also, a log table is created in the database. The log table is used to record changes to the text column, that is inserts, updates, and deletions. Insert, update, and delete triggers are defined for the text column to keep the log table up to date automatically.

If the text column that you are enabling belongs to a table that is part of a multiple-node nodegroup, the index directory that you specify must be available on all physical nodes. If you use the default directory specified in the text configuration, make sure that the path is available on all nodes of the nodegroup. If this is not convenient, you can specify a specific path for each node in the ENABLE TEXT COLUMN command.

If you change the node configuration of a nodegroup that contains a table that is enabled for Text Extender, you must reindex the table.
Tip

If you run out of log space in this step, see Enabling a text column in a large table for possible solutions.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]