About full text indexes

Full text indexes contain the information used to find objects by searching for embedded words or phrases. Full text indexes are also referred to as Verity collections. Lists of Verity collections are maintained in index areas.

Verity collection
A Verity collection contains full-text indexing information. It is represented by a VerityCollection object in the object store database and by a set of files in a file system. These files are maintained by the Autonomy software and contain the actual full-text information. Each Verity collection belongs to exactly one index area. The VerityCollection database object also associates the collection with its IndexArea and identifies the type of object being indexed (folder, annotation, and so on).
Index area
An index area is associated with a particular object store and is represented by an IndexArea object in the object store database. The IndexArea object contains the path name of the file system directory that stores the collection files. The Content Search Engine queries and updates this directory as indexable content is added to or removed from the object store.

Multiple index areas can hold data for a single object store, but each index area can hold data for only one object store. Because an index area has only one file system directory, multiple index areas are used for a single object store to spread the full-text indexing information across multiple file systems, either for geographic distribution or for scalability.
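The relationships described so far can be summarized in a small data model. This is only an illustrative sketch: the Python classes, field names, and directory paths below are hypothetical and are not the Content Engine API. They simply mirror the rules that each index area belongs to one object store and has one directory, while an object store can have several index areas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VerityCollection:
    """Hypothetical model of a collection: a name plus the indexed object type."""
    name: str
    indexed_type: str  # for example "Document", "Folder", or "Annotation"


@dataclass
class IndexArea:
    """Hypothetical model: one object store and one directory per index area."""
    name: str
    object_store: str
    root_directory: str
    collections: List[VerityCollection] = field(default_factory=list)


def index_areas_for(areas: List[IndexArea], object_store: str) -> List[IndexArea]:
    """Return every index area that holds data for the given object store."""
    return [area for area in areas if area.object_store == object_store]


# Example paths are made up; two areas spread one object store over two file systems.
areas = [
    IndexArea("area1", "OS1", "/fs1/indexes"),
    IndexArea("area2", "OS1", "/fs2/indexes"),
    IndexArea("area3", "OS2", "/fs1/indexes_os2"),
]
print([a.name for a in index_areas_for(areas, "OS1")])  # ['area1', 'area2']
```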

NOTE   If an object store has multiple index areas configured, the Content Engine (CE) optimizes performance by choosing which index area receives the full-text indexing information, based on the following rules.

1. The Index Area on the same site as the Virtual Server.
2. The Index Area on the same site as the Object Store.
3. The Index Area on the same site as the Storage Area.
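The rules above can be sketched as a simple lookup. This is a hedged illustration, not the Content Engine implementation: it assumes the rules are tried in the order listed, and the `SiteIndexArea` class and `choose_index_area` function are hypothetical names. The source does not say what happens when no rule matches, so the sketch returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SiteIndexArea:
    """Hypothetical stand-in for an index area and the site it lives on."""
    name: str
    site: str


def choose_index_area(
    areas: Sequence[SiteIndexArea],
    virtual_server_site: str,
    object_store_site: str,
    storage_area_site: str,
) -> Optional[SiteIndexArea]:
    """Return the first index area that satisfies one of the listed rules."""
    # Assumption: rule 1 is tried first, then rule 2, then rule 3.
    for preferred_site in (virtual_server_site, object_store_site, storage_area_site):
        for area in areas:
            if area.site == preferred_site:
                return area
    # No rule matched; the documented behavior for this case is unknown.
    return None


areas = [SiteIndexArea("ia_london", "London"), SiteIndexArea("ia_tokyo", "Tokyo")]
chosen = choose_index_area(areas, "Tokyo", "London", "London")
print(chosen.name)  # ia_tokyo
```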

The status of an index area is either Open (accepting new indexing information) or one of Closed, Standby, or Full (not accepting new indexing information). An index area can be created in Standby mode; it is then set to Open when the server accesses the index. The index area also stores Autonomy-specific parameters that describe its properties, such as Autonomy Collections, Autonomy Search Servers, and Maximum Number of Collections.
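The status rules can be expressed as a small state model. The enum and function names here are hypothetical and serve only to illustrate the behavior described above; they are not taken from the product API.

```python
from enum import Enum


class IndexAreaStatus(Enum):
    OPEN = "Open"
    CLOSED = "Closed"
    STANDBY = "Standby"
    FULL = "Full"


def accepts_new_indexing(status: IndexAreaStatus) -> bool:
    """Only an Open index area accepts new indexing information."""
    return status is IndexAreaStatus.OPEN


def status_after_server_access(status: IndexAreaStatus) -> IndexAreaStatus:
    """A Standby area is set to Open on access; other statuses are left unchanged."""
    if status is IndexAreaStatus.STANDBY:
        return IndexAreaStatus.OPEN
    return status


print(accepts_new_indexing(IndexAreaStatus.STANDBY))  # False
print(status_after_server_access(IndexAreaStatus.STANDBY).value)  # Open
```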
Verity collection size
A single Verity collection can store up to 8 million entries. One entry is added to a Verity collection for each content element, plus one additional entry that covers all the CBR-enabled string properties of an object (Document, Custom Object, and so on). Each object store can have an unlimited number of Verity collections, associated with an unlimited number of index areas. Verity collections are created automatically as needed when new data is full-text indexed.
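For capacity planning, the entry rule translates into simple arithmetic. The sketch below is a rough estimate under stated assumptions: the function names are hypothetical, an object with no CBR-enabled string properties is assumed to contribute no extra entry, and each collection is assumed to fill to the 8 million limit before another is created.

```python
MAX_ENTRIES_PER_COLLECTION = 8_000_000  # limit stated in the text


def entries_for_object(content_elements: int, has_cbr_string_properties: bool) -> int:
    """One entry per content element, plus one for the CBR-enabled string properties."""
    return content_elements + (1 if has_cbr_string_properties else 0)


def collections_needed(total_entries: int) -> int:
    """Smallest number of collections that can hold the entries (ceiling division)."""
    return -(-total_entries // MAX_ENTRIES_PER_COLLECTION)


# 10 million documents, each with 2 content elements and CBR-enabled properties.
total_entries = 10_000_000 * entries_for_object(2, True)
print(total_entries)                      # 30000000
print(collections_needed(total_entries))  # 4
```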

You can reduce the index size by using a stop word list. Words included in this list are not indexed. Default stop word lists are provided with the Content Search Engine. You can modify the stop word list to match your site's needs.
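The effect of a stop word list can be illustrated with a toy filter. This is a conceptual sketch only: the real search engine's tokenization and stop word file format are not described here, and the `indexable_terms` function and the sample word list are hypothetical.

```python
from typing import Iterable, List


def indexable_terms(text: str, stop_words: Iterable[str]) -> List[str]:
    """Split text on whitespace and drop any word that is in the stop word list."""
    stop = {word.lower() for word in stop_words}
    return [term for term in text.lower().split() if term not in stop]


sample_stop_words = ["the", "of", "an"]  # made-up list for illustration
print(indexable_terms("The status of an index area", sample_stop_words))
# ['status', 'index', 'area']
```

Fewer indexed terms mean a smaller index, at the cost of not being able to search for the excluded words.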