Using the style.stp File


In a style.stp file, you specify the words to be excluded from a collection's word indexes. A style.stp file is optional. Words that occur frequently in documents, such as articles, simple nouns, and verbs, can be excluded so that retrievals are performed faster. Excluding words from a word index will affect the behavior of the Verity engine.

The style.stp file is rarely used. Its main purpose is to exclude rare constructs in documents that look like words (such as the 70-character "words" starting with M found in encoded files).

style.stp Syntax Reference

A style.stp file is a flat ASCII file containing an excluded word list. The words in the list can appear in any order. Each word should be left-justified and appear on its own line. A sample style.stp file is shown below.


[0-9a-zA-Z]
..........+
an
and
the
of
or
but
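
The file layout described above (one left-justified ASCII entry per line) can be produced with a short script. The following Python sketch is illustrative only: the function name format_stop_list is hypothetical and is not part of any Verity tool.

```python
def format_stop_list(entries):
    """Return style.stp text: one left-justified ASCII entry per line.

    Hypothetical helper for illustration; not part of any Verity tool.
    """
    lines = []
    for entry in entries:
        word = entry.strip()      # left-justify: drop surrounding whitespace
        if not word:
            continue              # skip blank entries
        word.encode("ascii")      # raises UnicodeEncodeError if not ASCII
        lines.append(word)
    return "\n".join(lines) + "\n"


# Build the text for a small stop list; order does not matter.
stop_text = format_stop_list(["[0-9a-zA-Z]", "  an", "and ", "the"])
```

The returned text can then be saved under the name style.stp, for example with open(path, "w", encoding="ascii").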

style.stp Features

When creating a style.stp file, remember to consider the following:

Case Sensitivity

By default, the collection word index is case-sensitive. If your collection word indexes are case-sensitive, the style.stp file must include every case combination for words you want to stop. For example, if you want to stop both "and" and "And", you must include both entries in your style.stp file.
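
Listing every case combination by hand is error-prone, so the entries can be generated. The following Python sketch is one possible approach; the function name case_variants is hypothetical. Note that a word of k letters has 2^k combinations, so in practice you may prefer to list only the forms that actually occur in your documents.

```python
from itertools import product


def case_variants(word):
    """Return every upper/lower case combination of `word`, sorted.

    Hypothetical helper for illustration; not part of any Verity tool.
    """
    # For each character, offer its lowercase and uppercase forms.
    # Characters without case (digits, punctuation) yield a single choice.
    choices = [sorted({ch.lower(), ch.upper()}) for ch in word]
    return sorted("".join(combo) for combo in product(*choices))


# The three-letter word "and" yields 8 entries, including "and" and "And".
variants = case_variants("and")
```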

Regular Expressions

You can specify a regular expression as a word in the excluded word list. For example, the following regular expression could be entered:

[0-9a-zA-Z]

This regular expression keeps every one-character word (a single letter or digit) in a collection's documents out of the collection word indexes.

You can also use regular expressions to exclude long words from a collection's word indexes. Long words generally occur infrequently. Thus, users are less likely to search for them, and it is usually not crucial for them to be indexed.

To exclude words that are n characters or more in length, enter a regular expression consisting of n dots (.) followed by a plus sign (+) in your style.stp file. For example, to exclude words of 10 or more characters, enter 10 dots followed by a plus sign in your style.stp file, as follows:

..........+
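
To check what such entries would stop before adding them, their effect can be approximated outside the engine. The following Python sketch rests on an assumption: each entry is treated as a regular expression that must match the whole word, with Python's re module standing in for the engine's matcher. The actual Verity matching rules may differ. The names long_word_pattern and is_stopped are hypothetical.

```python
import re


def long_word_pattern(n):
    """Build the entry of n dots plus a plus sign (words of n or more characters)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "." * n + "+"


def is_stopped(word, entries):
    """Return True if any entry matches the entire word.

    Assumption: whole-word matching with Python regex semantics,
    which only approximates the engine's behavior.
    """
    return any(re.fullmatch(entry, word) is not None for entry in entries)


entries = ["[0-9a-zA-Z]", long_word_pattern(10), "an", "and", "the"]

for word in ["a", "7", "ab", "and", "And", "retrieval", "retrievals"]:
    print(word, is_stopped(word, entries))
```

Under these assumptions, the nine-character word "retrieval" is kept while the ten-character word "retrievals" is stopped, and "And" is kept because only the lowercase entry "and" is listed.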





Copyright © 2002, Verity, Inc. All rights reserved.