Index jobs

An index job rebuilds the full-text index used in content-based retrieval (CBR). You might need to rebuild an index if it becomes corrupted or if a configuration change requires it. Index jobs are executed from Enterprise Manager. Searches and indexing proceed normally while index jobs are in progress, but searches might yield incomplete results.

Most index jobs require sweeping the object store database for the collections or classes to be indexed. This sweep requires a table scan, so it can take considerable time on a large table, even if the amount of data to be indexed is small. For each table that it needs to inspect, an index job performs one table scan for all classes to be indexed in that table and one table scan for all collections to be indexed in that table. To minimize the number of table scans, put all classes or collections to be indexed for the same table into a single index job.
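The effect of grouping can be illustrated with a small counting sketch. It is not product code and does not call any Enterprise Manager or Content Engine API; the Target type, the function names, and the table and class names are all made up for illustration. It only applies the rule stated above: within one job, one scan per table for its classes and one scan per table for its collections.

```python
from typing import Iterable, NamedTuple


class Target(NamedTuple):
    """Something to be indexed (hypothetical model, not a product type)."""
    name: str   # class or collection name (illustrative)
    kind: str   # "class" or "collection"
    table: str  # database table that must be scanned for this target


def scans_for_job(targets: Iterable[Target]) -> int:
    # Within one job, all classes in a table share one scan and all
    # collections in a table share one scan, so count distinct
    # (table, kind) pairs.
    return len({(t.table, t.kind) for t in targets})


def total_scans(jobs: Iterable[Iterable[Target]]) -> int:
    # Scans are not shared between jobs, so the totals simply add up.
    return sum(scans_for_job(job) for job in jobs)


# Three classes stored in the same (made-up) table.
targets = [
    Target("Invoice", "class", "docs_table"),
    Target("Contract", "class", "docs_table"),
    Target("Memo", "class", "docs_table"),
]

separate = total_scans([[t] for t in targets])  # one job per class
combined = total_scans([targets])               # one job for all classes

print(separate, combined)
```

Under these assumptions, three separate jobs scan the table three times, while a single combined job scans it once.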

An index job supports three functions:

Class indexing
Class indexing applies to a class whose properties or content have been enabled or disabled for indexing. Objects with newly enabled properties or classes are added to the index, and objects with newly disabled properties or classes are removed from the index. To run this type of indexing, specify a class of objects to be indexed. The class can be a document class, a folder class, or a subclass of any base class that can be indexed.

Index a class after you change the CBR-enabled status of that class or of one of its properties. If CBR is enabled for a class and the class is indexed, all instances of that class are full-text indexed. Likewise, if the properties to be indexed on a class are changed and the class is indexed, the index information is updated to include any newly enabled properties and to remove any newly disabled properties. Finally, if you disable CBR for a class that was formerly enabled and then index that class, instances of that class are removed from the indexing information.

NOTE   Do not enable or disable CBR on a class or property while an index job is running on that class. Doing so will cause unpredictable results. Stop any index job running on a class before changing the configuration, and then resubmit the index job on that class.

If the class selected to index is a base class such as Document, Annotation, Folder, or Custom Object, new Verity collections are created to hold the new indexing information. When the indexing completes, the old collections that previously held the indexing information are deleted. Thus, indexing a base class is somewhat faster than indexing a subclass of the base class, because the old indexing information is not deleted object by object but is removed all at once when the entire Verity collection is deleted at the end of the index job.

Collection indexing
Collection indexing is performed when you want to reindex everything that is in a collection, such as when a collection becomes corrupted or is lost due to a disk drive failure.

Collection indexing is accomplished by creating new full-text index data, and then deleting the current indexing data. This type of indexing is implemented by specifying one or more Autonomy (also called Verity) collections to be indexed. A single Autonomy collection can be reindexed if just one becomes corrupted, or all Autonomy collections for an index area can be reindexed if the entire directory holding collections for the index area is lost.

Do not use collection indexing to rebuild an index when you enable or disable CBR on a class or property. Even if all collections of an object store are selected, only the data in the collection is submitted for indexing. Any classes newly enabled for indexing are not reindexed because instances of these classes are not in the existing collections.

Single item indexing
Single item (individual object) indexing is performed when you want to rebuild the indexing information for one particular object that is a Document, Folder, Annotation, or Custom Object. Use it when the original attempt to index that object failed, or when you need to change some aspect of the configuration (such as an Autonomy style file).
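The choice among the three functions can be summarized as a simple lookup. The following sketch is only a reading aid: the enum names and the choose_job_function helper are hypothetical and do not correspond to anything in Enterprise Manager. It encodes the guidance above, including the rule that a CBR setting change calls for class indexing rather than collection indexing.

```python
from enum import Enum


class Situation(Enum):
    """Reasons for rebuilding index data (illustrative names)."""
    CBR_SETTING_CHANGED = "CBR enabled or disabled on a class or property"
    COLLECTION_DAMAGED = "collection corrupted or lost"
    OBJECT_INDEX_FAILED = "original index attempt failed for one object"


class JobFunction(Enum):
    """The three index job functions described in this topic."""
    CLASS = "class indexing"
    COLLECTION = "collection indexing"
    SINGLE_ITEM = "single item indexing"


# Collection indexing resubmits only what is already in the collections,
# so it is deliberately not mapped to a CBR setting change.
_CHOICE = {
    Situation.CBR_SETTING_CHANGED: JobFunction.CLASS,
    Situation.COLLECTION_DAMAGED: JobFunction.COLLECTION,
    Situation.OBJECT_INDEX_FAILED: JobFunction.SINGLE_ITEM,
}


def choose_job_function(situation: Situation) -> JobFunction:
    """Return the index job function suggested for the given situation."""
    return _CHOICE[situation]


print(choose_job_function(Situation.CBR_SETTING_CHANGED).value)
```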