About index jobs
An index job rebuilds the full text index when the index becomes corrupted or when a configuration change requires a rebuild.
Index jobs are executed from Enterprise Manager. Existing index data remains available while indexing is in progress.
An index job supports three functions:
- Class indexing
- Class indexing applies to a class whose properties or content have been enabled or disabled for indexing. Objects with newly enabled properties or classes are added to the index, and objects with newly disabled properties or classes are removed from the index. This type of indexing is implemented by specifying a class of objects to be indexed. The class can be Document, Folder, or a subclass of any base class that can be indexed.
A class must be indexed after the CBR (content-based retrieval) enabled status of the class or of any of its properties is changed. If a class is CBR enabled and the class is indexed, all instances of that class are full text indexed. Likewise, if the set of properties to be indexed on a class is changed and the class is indexed, the index information is updated to include any newly enabled properties and to remove any newly disabled properties. Finally, if a class that was formerly CBR enabled is now disabled and the class is indexed, instances of that class are removed from the index information.
NOTE Do not change the CBR enabled status of a class or property while an index job is running on that class. Doing so will cause unpredictable results. Abort any index job running on a class before changing the configuration, and then resubmit the index job on that class.
If the class selected for indexing is a base class (Document, Annotation, Folder, or Custom Object), new Verity collections are created to hold the new indexing information. When the index operation completes, the old collections that previously held the indexing information are deleted. Indexing a base class is therefore somewhat faster than indexing a subclass of that base class: the old indexing information is not deleted object by object, but is removed all at once when the entire Verity collection or collections are deleted at the end of the index job.
- Collection indexing
- Collection indexing is performed when you want to re-index everything that is in a collection, such as when a collection becomes corrupted or is lost due to a disk drive failure.
Collection indexing creates new full text index data and then deletes the current index data. This type of indexing is implemented by specifying one or more Autonomy Collections to be indexed. A single Autonomy Collection can be re-indexed if only that collection becomes corrupted, or all Verity Collections for an Index Area can be re-indexed if the entire directory holding the collections for that Index Area is lost.
NOTE Do not use Collection Indexing if you change the CBR Enabled flag on a class or property. Even if all collections of an object store are selected, only the data in the collection is submitted for indexing. Any classes newly enabled for indexing are not re-indexed because instances of these classes are not in the existing collections.
- Single item indexing
- Single item indexing is performed when you want to rebuild the indexing information for one particular object (a Document, Folder, Annotation, or Custom Object), either because the original attempt to index that object failed or because you need to change some aspect of the configuration (such as an Autonomy style file) and then re-index.
- Performing this operation on an object whose class is CBR enabled will re-attempt to index the object.
- Performing this operation on an object whose class is not CBR enabled will attempt to delete this object from any existing indexes.
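The rules described above for class indexing and single item indexing come down to the same per-object decision: an object whose class is CBR enabled is indexed (or re-indexed), and an object whose class is not CBR enabled is removed from any index that holds it. The following Python sketch models only that decision logic. It does not call any Content Engine or Enterprise Manager API; the names `IndexAction`, `action_for_object`, and `property_changes` are hypothetical and exist only for illustration.

```python
from enum import Enum


class IndexAction(Enum):
    """Possible outcomes for one object during an index job (illustrative only)."""
    INDEX = "index"    # add the object to the index, or rebuild its entry
    REMOVE = "remove"  # delete the object from any existing index


def action_for_object(class_cbr_enabled: bool) -> IndexAction:
    """Return the action an index job takes for an object, given its class's CBR status."""
    return IndexAction.INDEX if class_cbr_enabled else IndexAction.REMOVE


def property_changes(previously_enabled: set, currently_enabled: set):
    """Return (newly enabled, newly disabled) properties after a configuration change.

    Newly enabled properties are added to the index information;
    newly disabled properties are removed from it.
    """
    newly_enabled = currently_enabled - previously_enabled
    newly_disabled = previously_enabled - currently_enabled
    return newly_enabled, newly_disabled


# Hypothetical example: "Summary" was enabled and "Author" was disabled.
added, removed = property_changes({"Title", "Author"}, {"Title", "Summary"})
print(action_for_object(True), sorted(added), sorted(removed))
```

The sketch makes one point explicit: the outcome depends on the CBR status at the time the job runs, which is why changing that status while a job is in progress gives unpredictable results.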
Most index jobs must sweep the object store database to find the collections or classes to be indexed. This sweep requires a table scan on the database, so it can take considerable time on a large table, even if the amount of data to be indexed is small. For each table that must be inspected, an index job performs one table scan for all of the classes to be indexed in that table, and one table scan for all of the collections to be indexed in that table. To minimize the number of table scans, put all classes or collections to be indexed for the same table into a single index job.
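The effect of grouping can be illustrated with a small calculation. The Python sketch below is a simplified model of the scan rule stated above (one scan per table for classes and one per table for collections, per job), not a measurement of real behavior. The target names, the table names `documents` and `folders`, and the functions `count_table_scans` and `group_by_table` are all hypothetical.

```python
from collections import defaultdict


def count_table_scans(jobs):
    """Count table scans under the simplified model.

    Each job scans a table once for all of its classes in that table
    and once for all of its collections in that table.
    """
    total = 0
    for job in jobs:
        scans = {(target["table"], target["kind"]) for target in job}
        total += len(scans)
    return total


def group_by_table(targets):
    """Build one job per table, containing every target stored in that table."""
    jobs = defaultdict(list)
    for target in targets:
        jobs[target["table"]].append(target)
    return list(jobs.values())


# Hypothetical indexing targets and the tables that hold their objects.
targets = [
    {"name": "Invoice", "kind": "class", "table": "documents"},
    {"name": "Contract", "kind": "class", "table": "documents"},
    {"name": "CaseFolder", "kind": "class", "table": "folders"},
    {"name": "collection_01", "kind": "collection", "table": "documents"},
]

one_job_per_target = [[target] for target in targets]
grouped = group_by_table(targets)

print(count_table_scans(one_job_per_target))  # 4 scans
print(count_table_scans(grouped))             # 3 scans
```

In this example, submitting the two classes stored in the `documents` table as one job saves a scan of that table. The saving grows with the number of classes or collections that share a table.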