Built-in Indexing Modes


Several built-in modes are predefined by the Verity engine, each of which is designed to support a different indexing behavior. For any one collection, the application can implement one or more built-in or custom indexing modes.

The indexing mode names are described below.

Predefined Indexing Mode Names

Mode Name      Description
generic        The generic mode is the base mode from which all other modes inherit their behaviors. It is optimized for average overall performance.
fastsearch     The fastsearch mode can be used to optimize indexes for the fastest possible searching.
bulkload       The bulkload mode can be used to index large numbers of documents using the bulk modify/bulk update feature.
newsfeedidx    The newsfeedidx mode can be used to index documents arriving from a live feed, quickly and efficiently.
newsfeedopt    The newsfeedopt mode can be used to optimize collections that were indexed using the newsfeedidx mode.
readonly       The readonly mode can be used to disable modifications to indexes.

You can also define your own indexing mode. For more information, see the section "Custom Indexing Modes" later in this chapter.

Generic Mode (generic)

The generic mode is the base mode from which all other modes inherit their behaviors. It is optimized for average overall performance and assumes nothing about the desired document indexing rate, the number of simultaneous searches, and so on.

The generic mode is not very efficient at performing any particular optimization in a short amount of time. It does not perform advanced search optimizations such as creating spanning word lists or squeezing deleted documents.

The Verity engine builds optimized VDBs for the generic mode. The generic indexing mode is equivalent to setting the following metaparameters:

typical_document_size    2000
document_throughput      60
document_latency         200

The generic mode (named "generic") is the default mode when no default mode is specified, either in the style.plc file (for all applications) or in VdkCollectionOpenArgRec (for custom applications). It is also the default mode when the style.plc file does not exist.
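The fallback rule above can be sketched in a few lines of Python. This is only an illustration, not Verity code: resolve_default_mode and its configured_mode argument are hypothetical names standing in for whatever value an application obtained from style.plc or VdkCollectionOpenArgRec, and None models both "no default mode specified" and "style.plc does not exist".

```python
from typing import Optional

# The fallback mode named in the text above.
FALLBACK_MODE = "generic"


def resolve_default_mode(configured_mode: Optional[str]) -> str:
    """Return the indexing mode to use, falling back to "generic".

    `configured_mode` is a hypothetical stand-in for a mode name the
    application read from style.plc or VdkCollectionOpenArgRec.
    None or a blank string means no default mode was specified.
    """
    if configured_mode is None or not configured_mode.strip():
        return FALLBACK_MODE
    # Built-in and custom mode names are passed through unchanged.
    return configured_mode.strip()


if __name__ == "__main__":
    print(resolve_default_mode(None))
    print(resolve_default_mode("fastsearch"))
```

The sketch deliberately does not validate the name against the built-in list, since custom indexing modes are also allowed.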

Fast Search Mode (fastsearch)

The fastsearch mode is optimized to index documents so that retrievals happen as quickly as possible. This mode causes the Verity engine to do more work at indexing time.

The Verity engine performs the following optimizations for the fastsearch mode:

The fastsearch mode is equivalent to setting the following metaparameters:

typical_document_size    2000
document_throughput      60
document_latency         200

Bulk Load Mode (bulkload)

The bulkload mode is for indexing large numbers of documents in large batches with the bulk modify/bulk update mechanism. It is primarily intended for creating new collections from a large number of pre-existing documents. The bulkload mode inherits most of its settings from the fastsearch mode.

The Verity engine performs the following optimizations for the bulkload mode:

The bulkload mode is equivalent to setting the following metaparameter:

typical_document_size    2000
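One way to picture the inheritance between modes is as layered settings, where a mode's own metaparameters take precedence over those of the mode it inherits from. The Python sketch below assumes that model; the layered lookup is an illustration and not a description of how the Verity engine resolves settings internally. The metaparameter values and parent relationships are the ones given in this chapter, while OVERRIDES, PARENT, and effective_settings are hypothetical names.

```python
from collections import ChainMap
from typing import Dict, Optional

# Metaparameter values as listed in this chapter for each mode.
OVERRIDES: Dict[str, Dict[str, int]] = {
    "generic": {
        "typical_document_size": 2000,
        "document_throughput": 60,
        "document_latency": 200,
    },
    "fastsearch": {
        "typical_document_size": 2000,
        "document_throughput": 60,
        "document_latency": 200,
    },
    "bulkload": {"typical_document_size": 2000},
}

# Parent links: generic is the base mode; bulkload inherits from fastsearch.
PARENT: Dict[str, Optional[str]] = {
    "generic": None,
    "fastsearch": "generic",
    "bulkload": "fastsearch",
}


def effective_settings(mode: str) -> Dict[str, int]:
    """Collect a mode's settings, nearest layer first (assumed model)."""
    layers = []
    current: Optional[str] = mode
    while current is not None:
        layers.append(OVERRIDES[current])
        current = PARENT[current]
    # ChainMap looks keys up in the first mapping that defines them.
    return dict(ChainMap(*layers))


if __name__ == "__main__":
    print(effective_settings("bulkload"))
```

Under this model, bulkload sets only typical_document_size itself and picks up document_throughput and document_latency from fastsearch.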

News Feed Indexer Mode (newsfeedidx)

The newsfeedidx mode is optimized to accept a moderate number of documents in a short amount of time when the documents arrive in frequent small batches. It is designed to keep up with the high arrival rates of news feeds without falling behind in the indexing.

The mode is designed to index incoming documents and perform small merges for partitions of up to 100 documents each. These small partitions are not optimized VDBs, because optimizing such small partitions would incur significant overhead.
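The effect of capping partitions at 100 documents can be illustrated with a toy model. The Python sketch below greedily merges consecutive incoming batches into partitions of at most 100 documents. The greedy strategy and the names merge_small_batches and MAX_PARTITION_DOCS are assumptions made for illustration; the actual merge policy of the Verity engine is not described here.

```python
from typing import Iterable, List

# Partition size limit stated in the text above.
MAX_PARTITION_DOCS = 100


def merge_small_batches(batch_sizes: Iterable[int],
                        limit: int = MAX_PARTITION_DOCS) -> List[int]:
    """Greedily merge consecutive batches into partitions of <= limit docs.

    Takes the document count of each incoming batch and returns the
    document count of each resulting partition. Toy model only.
    """
    partitions: List[int] = []
    current = 0
    for size in batch_sizes:
        if size < 0 or size > limit:
            raise ValueError(f"batch size {size} is outside 0..{limit}")
        if current + size > limit:
            # Close the current partition before it would exceed the limit.
            partitions.append(current)
            current = 0
        current += size
    if current:
        partitions.append(current)
    return partitions


if __name__ == "__main__":
    print(merge_small_batches([30, 40, 50, 20, 100, 5]))
```

For the batch sizes 30, 40, 50, 20, 100, and 5, this model produces partitions of 70, 70, 100, and 5 documents.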

The newsfeedidx mode sets the following metaparameters:

typical_document_size    2000
document_throughput      1000
document_latency         60

If you are developing an indexing application using the VDK API, the following service levels must be set for the session with the newsfeedidx mode: VdkServiceLevel_Index, VdkServiceLevel_Optimize.

News Feed Optimizer Mode (newsfeedopt)

The newsfeedopt mode is designed to perform background work that the newsfeedidx mode does not. Both modes are designed to be used together.

The newsfeedopt mode merges the partitions (components of collections) that the newsfeedidx mode creates into large, optimized partitions. This mode ensures fast search performance by:

The Verity engine performs the following optimizations for the newsfeedopt mode:

If you are developing an indexing application using the Verity Developer's Kit, the following service levels must be set for the session with the newsfeedopt mode: VdkServiceLevel_Optimize, VdkServiceLevel_DBA, VdkServiceLevel_Delete.
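An application could check the service-level requirements of the two news feed modes before opening a collection. The Python sketch below models that check with a flag enumeration. ServiceLevel is a hypothetical stand-in for the VdkServiceLevel_* constants named in this chapter (its numeric values are arbitrary, not the real VDK values), and missing_levels is an illustrative helper, not a VDK function.

```python
from enum import Flag, auto


class ServiceLevel(Flag):
    """Hypothetical stand-ins for VdkServiceLevel_*; values are arbitrary."""
    INDEX = auto()
    OPTIMIZE = auto()
    DBA = auto()
    DELETE = auto()


# Requirements as stated in this chapter for the two news feed modes.
REQUIRED_LEVELS = {
    "newsfeedidx": ServiceLevel.INDEX | ServiceLevel.OPTIMIZE,
    "newsfeedopt": (ServiceLevel.OPTIMIZE | ServiceLevel.DBA
                    | ServiceLevel.DELETE),
}


def missing_levels(mode: str, session_levels: ServiceLevel) -> ServiceLevel:
    """Return the required levels that the session has not enabled."""
    # Modes without a listed requirement need nothing in this sketch.
    required = REQUIRED_LEVELS.get(mode, ServiceLevel(0))
    return required & ~session_levels


if __name__ == "__main__":
    session = ServiceLevel.OPTIMIZE
    print(missing_levels("newsfeedopt", session))
```

An empty result (a falsy flag value) means the session satisfies the mode's stated requirements.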

Read Only Mode (readonly)

The readonly mode is not an indexing mode in the strict sense, because it does not affect how indexing occurs. Instead, it disables data writes to the collection. This mode is useful for accessing a collection on a read-only medium such as CD-ROM.





Copyright © 2002, Verity, Inc. All rights reserved.