Tuning Collection Index Contents


The Verity engine indexes documents using configuration options specified in style files. A directory of default style files is included with each Verity product.

The sections below summarize the style file options available for index tuning.

Style Files Affecting Collection Index Contents

The configuration options specified in the style files listed below affect how word indexes and feature vectors (for clustering and summarization) are generated during the indexing process.

Style File Name   Function
style.lex         Specifies nonalphanumeric characters that can be used in search criteria. The specified characters are treated as legal word characters, so words that contain them, such as OS/2, appear as index entries.
style.stp         Specifies an excluded word list. The listed words are omitted from the collection word index, freeing the index of unwanted words.
style.go          Specifies an included word list. Only the listed words are included in the collection word index, limiting the word index to a narrow vocabulary.
style.ufl         Specifies collection fields for which field indexes will be built. With an /indexed or /minmax field index, the Verity engine indexes collection field values so that searches on those values run more efficiently.
style.prm         Specifies additional data to be stored in the index, including SOUNDEX data, assist data, and highlight data.
style.fxs         Specifies words to exclude from feature vectors so the words do not appear in document summaries and clusters.
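
The combined effect of legal characters, an excluded word list, and an included word list can be illustrated with a short Python sketch. This is a conceptual model only, not the Verity engine's implementation and not style file syntax; the function names tokenize and index_terms, their parameters, and the sample text are all hypothetical.

```python
import re

def tokenize(text, extra_chars=""):
    """Split text into lowercase words.

    extra_chars models the style.lex idea: nonalphanumeric characters
    that should be treated as legal word characters (illustrative only).
    """
    pattern = "[A-Za-z0-9" + re.escape(extra_chars) + "]+"
    return [word.lower() for word in re.findall(pattern, text)]

def index_terms(text, extra_chars="", stop_words=None, go_words=None):
    """Return the sorted, unique terms that would reach a word index.

    stop_words models an excluded word list (style.stp idea);
    go_words models an included word list (style.go idea).
    """
    terms = tokenize(text, extra_chars)
    if stop_words is not None:
        terms = [t for t in terms if t not in stop_words]
    if go_words is not None:
        terms = [t for t in terms if t in go_words]
    return sorted(set(terms))

sample = "The OS/2 kernel and the network stack"

# Without "/" as a legal character, OS/2 splits into two tokens.
print(tokenize(sample))
# With "/" allowed and a stop list, OS/2 survives as a single term.
print(index_terms(sample, "/", stop_words={"the", "and"}))
# A go list narrows the index to the listed vocabulary.
print(index_terms(sample, "/", go_words={"kernel", "os/2"}))
```

In this model, a stop list removes a few unwanted words from an otherwise open vocabulary, while a go list discards everything except the listed words.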

Setting Up the Collection style Directory

The Verity engine assigns a set of style files to a collection when the collection is first created. If no style files are specified at collection creation time, a default set of style files is used to configure the collection contents.

To tune the collection index contents, create or edit the style files in the new collection_name/vdkstyle directory before indexing documents for the collection.

NOTE: Do not edit the style.ddd file.
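
The editing step can be scripted. The Python sketch below appends entries to a style file under collection_name/vdkstyle and refuses to touch style.ddd. It is a minimal sketch under stated assumptions: the helper name add_style_entries is hypothetical, the one-entry-per-line layout is assumed rather than taken from this page, and the demonstration runs against a temporary directory instead of a real collection.

```python
import tempfile
from pathlib import Path

PROTECTED_FILES = {"style.ddd"}  # per the NOTE above: never edit this file

def add_style_entries(collection_dir, file_name, entries):
    """Append entries to <collection_dir>/vdkstyle/<file_name>.

    Hypothetical helper. Assumes one entry per line, which is an
    assumption about the style file layout, not documented here.
    The file is created if it does not exist yet.
    """
    if file_name in PROTECTED_FILES:
        raise ValueError(file_name + " must not be edited")
    style_dir = Path(collection_dir) / "vdkstyle"
    if not style_dir.is_dir():
        raise FileNotFoundError("no vdkstyle directory in " + str(collection_dir))
    style_file = style_dir / file_name
    with open(style_file, "a", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(entry + "\n")
    return style_file

# Demonstration against a throwaway directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    collection = Path(tmp) / "collection_name"
    (collection / "vdkstyle").mkdir(parents=True)
    stop_file = add_style_entries(collection, "style.stp", ["the", "and"])
    print(stop_file.read_text(encoding="utf-8"))
```

Any such edits belong before the first indexing run, because the style files determine what is written to the index.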





Copyright © 2002, Verity, Inc. All rights reserved.