Field Definition Files


The field definition files define the collection schema. These are the style files used for field definition:

Internal fields are defined in the default style.ddd file. This file controls the collection's schema and uses $include statements to load the style.prm file and the style.xfl file, which in turn includes the style.sfl and style.ufl files.

NOTE: You should never manually change the contents of the style.ddd file.

Custom user fields (custom application fields) are defined in the style.ufl file. This file is included by the style.xfl file.

Standard fields are defined in the default style.sfl file. This file is included by the style.xfl file.

Internal Fields-In the style.ddd File

The style.ddd file contains definitions for all internal fields included in the default schema for the collection's internal documents table. It includes the style.prm file near the top and ends with a $include statement for the style.xfl file, which in turn includes the style.sfl and style.ufl files.
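
The include chain can be pictured with the sketch below. The two style.ddd lines are taken from the default file shown in this section. The style.xfl file is not reproduced here, so the lines shown for it are an assumption based on the description above, not a copy of the shipped file.

```text
# style.ddd (excerpt from the default file)
$include style.prm
#   ... descriptor and internal data-table definitions ...
$include style.xfl

# style.xfl (ASSUMED contents, inferred from the description above)
$include style.sfl
$include style.ufl
```

Read this way, standard and custom fields both reach the schema through style.xfl, which is why you change style.sfl or style.ufl and leave style.ddd alone.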

style.ddd Contents

NOTE: The style.ddd file should not be edited. The default style.ddd file for the File System gateway is shown below.


#
# Document Dataset Descriptor
#
# DO NOT add user fields to this file - add them to style.ufl
# which is included at the end of this file.
$control: 1
$include style.prm
$subst: 1
descriptor:
  /collection = yes
{
  # Header information for partition management
  data-table: _df
    /num-records = 1
    /max-records = 1
  {
    worm: _DBVERSION text
    fixwidth: _DDDSTAMP 4 date
    varwidth: _DOCIDX _dv
    fixwidth: _DOCIDX_OF 4 unsigned-integer
    fixwidth: _DOCIDX_SZ 3 unsigned-integer
    fixwidth: _PARTDESC 32 text
    constant: _FtrCfg text "${DOC-FEATURES:}"
    constant: _SumCfg text "${DOC-SUMMARIES:}"
    fixwidth: _SPARE1 16 text
    fixwidth: _SPARE2 4 signed-integer
  }
  # Required internal fields per document
  data-table: _df
    /offset = 64
  {
    autoval: _STYLE sirepath
    fixwidth: _DOCID 4 unsigned-integer
    fixwidth: _PARENTID 4 unsigned-integer
      /_minmax-nonzero = yes
    fixwidth: _SECURITY 4 unsigned-integer
      /minmax = yes
    fixwidth: _INDEX_DATE 4 date
      /_minmax-nonzero = yes
  }
$ifdef DOC-FEATURES
  # Optional feature vector per document
  data-table: _dg
  {
    varwidth: VDKFEATURES _dh
      /_implied_size
  }
$endif
$ifdef DOC-SUMMARIES
  # Optional generated summary per document
  data-table: _di
  {
    varwidth: VDKSUMMARY _dj
      /_implied_size
  }
$endif
  data-table: _dk
  {
    dispatch: DOC
    varwidth: DOC_FN _dl
  }
  # ---------------------------------------------------------------
  # The VdkVgwKey is the application's primary key to identify
  # each document in the Document Data Table. By default, the
  # VdkVgwKey is a text string no more than 256 bytes (VdkDocKey_MaxSize)
  # in length. It is stored in a separate data-table, indexed and
  # minmaxxed to minimize the time required to lookup by VdkVgwKey.
  data-table: aaa
  {
    varwidth: VdkVgwKey aab
      /indexed = yes
      /minmax = yes
  }
  # ---------------------------------------------------------------
  # All extensions the the DDD schema are included via style.xfl
  # This includes TIS Standard fields, User defined fields and
  # gateway specific fields.
$include style.xfl
}
$$

Standard Fields-In the style.sfl File

All collections have the following fields defined in the internal documents table by default, because they are defined in the default style.sfl file (the standard fields file). Standard fields are populated by the filters and gateways used. Not all filters and gateways populate all standard fields, so some standard fields may be defined but left unpopulated during indexing.

Field Name    Description
Title         Title of the document
Author        Author of the document
Keywords      Keywords in the document
MIME-Type     The mime type of the document. This field is populated by the universal filter.
Charset       The character set used in the document
To            To field in the document, such as the value of "To:" in an e-mail.
Date          Last date the document was modified
Newsgroups    News groups (populated only by the news zone filter)
PageMap       A vector of integers, one for each page, describing the number of word instances for each page (populated only by the PDF filter)

The default style.sfl file also contains many additional field definitions that can be defined and populated as standard fields. Enabling all of them would create a very large document index, so they are commented out by default. If you need any of them, uncomment them before creating your collection. For example, to search over a PDF-specific field in PDF documents, uncomment the corresponding field definitions in the style.sfl file.
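
As an illustration, the excerpt below shows the flt_pdf portion of the default style.sfl file with two fields, NumPages and FTS_Creator, uncommented. The choice of these two fields is only an example; the definitions themselves are taken unchanged from the default file shown later in this section.

```text
# The following fields are filled in by flt_pdf
#varwidth: FileName _sv
fixwidth: NumPages 4 unsigned-integer
#fixwidth: PermanentID 32 text
#fixwidth: InstanceID 32 text
#varwidth: DirID _sv
#fixwidth: WXEVersion 1 unsigned-integer
varwidth: FTS_Creator _sv
#varwidth: FTS_Producer _sv
#fixwidth: FTS_CreationDate 4 date
```

Only the leading # is removed from each chosen line. The edit needs to be made before the collection is created, as noted above.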

For complete information about the universal filter and its configuration, refer to "The Universal Filter" in Chapter 6.

Field Aliases in style.sfl

The style.sfl file uses field aliases to map field names from various filters to a set of common field names. For example, in the default style.sfl file the PDF field name "FTS_Title" is an alias for the collection field named "Title". Field aliasing simplifies field display and lets users query with the field names they are familiar with: a field search works with either the field name or any of its aliases. By default, for example, you can perform field searches using either "FTS_Title" or "Title". To override this default mapping, you can create a custom field in the style.ufl file. This file is described in "Custom User Fields-In the style.ufl File."
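
A minimal sketch of such an override follows, assuming you want FTS_Title stored as its own field instead of resolving to Title. The fragment would go in style.ufl. The data-table name ddf and the field table name dxa are borrowed from the example comment in the default style.ufl file and are placeholders, not required values. This section does not state whether the /alias = FTS_Title line in style.sfl must also be commented out, so verify that against your installation.

```text
# style.ufl (hypothetical addition; table names are placeholders)
data-table: ddf
{
  # Define FTS_Title as a separate custom field rather than
  # relying on the default alias to Title in style.sfl.
  varwidth: FTS_Title dxa
}
```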

style.sfl Contents

The style.sfl file is referenced by the style.xfl file, which is in turn referenced by the style.ddd file. Do not edit the style.xfl or style.ddd files; instead, change the state of fields here in the style.sfl file, or add your custom fields to the style.ufl file.

The default style.sfl file for the File System gateway is shown below.


# style.sfl - Verity-Defined Standard Fields
#
# These fields are included in the internal documents table.
# They are filled in by various filters and gateways that Verity
# ships, and are the "standard fields" that Verity suggests should
# exist in all Verity collections. They are not required in your
# collection. Instead, they are merely highly recommended to
# promote the ability to use your collection with other products
# that use Verity's search technology. You can comment out the
# fields below to save space, or uncomment others to gain
# functionality.
#
data-table: _sf
{
  # Title is filled in by: zone -html, flt_pdf, flt_kv
  varwidth: Title _sv
    /alias = FTS_Title
  # Subject is filled in by: flt_pdf, zone -email, zone -news,
  # flt_kv
  varwidth: Subject _sv
    /alias = FTS_Subject
  # Author is filled in by: flt_pdf, zone -email, zone -news,
  # flt_kv
  varwidth: Author _sv
    /alias = From
    /alias = FTS_Author
    /alias = Source
  # Keywords is filled in by: flt_pdf, zone -news, flt_kv
  varwidth: Keywords _sv
    /alias = FTS_Keywords
    /alias = Keyword
    /alias = Reference
  # Snippet is filled in by: vgw_url
  varwidth: Snippet _sv
    /alias = Abstract
  # MIME-Type is filled in by: universal
  varwidth: MIME-Type _sv
  # Charset is filled in by: flt_cmap
  varwidth: Charset _sv
  # To is filled in by: zone -email
  varwidth: To _sv
    /alias = Destination
  # Date is filled in by: zone -email, zone -news, flt_pdf, flt_kv
  #
  # This field is the "last modified" date, not the creation date
  fixwidth: Date 4 date
    /alias = Modified
    /alias = FTS_ModificationDate
    /alias = Recorded_Date
    /alias = Version_Date
  # NewsGroups is filled in by: zone -news
  varwidth: NewsGroups _sv
  # PageMap is filled in by: flt_pdf
  # This field is required to do highlighting in pdf documents.
  # Do not comment this if you want pdf highlighting!
  varwidth: PageMap _sv
    /_hexdata = yes
  #
  # The following are fields that could be filled in by Verity
  # shipped code, but they are currently not populated by many
  # documents. To save space, they are commented out here. You may
  # comment them back in before indexing your documents if you need
  # them in your collection.
  # The following fields are filled in by "zone -news"
  #varwidth: References _sv
  # The following fields are filled in by flt_pdf
  #varwidth: FileName _sv
  #fixwidth: NumPages 4 unsigned-integer
  #fixwidth: PermanentID 32 text
  #fixwidth: InstanceID 32 text
  #varwidth: DirID _sv
  #fixwidth: WXEVersion 1 unsigned-integer
  #varwidth: FTS_Creator _sv
  #varwidth: FTS_Producer _sv
  #fixwidth: FTS_CreationDate 4 date
  # The following fields are filled in by vgw_url
  #varwidth: Ext _sv
  #varwidth: URL _sv
  #varwidth: _Created _sv
  #varwidth: _Modified _sv
  #fixwidth: Created 4 date
  #fixwidth: Modified 4 date
  #fixwidth: Size 4 unsigned-integer
  # The following fields are filled in by flt_kv
  #varwidth: Dictionary _sv
  #varwidth: CodePage _sv
  #varwidth: Comments _sv
  #varwidth: Template _sv
  #varwidth: LastAuthor _sv
  #varwidth: RevNumber _sv
  #fixwidth: PageCount 4 unsigned-integer
  #fixwidth: WordCount 4 unsigned-integer
  #fixwidth: CharCount 4 unsigned-integer
  #varwidth: AppName _sv
  #varwidth: ThumbNail _sv
  #fixwidth: Doc_Security 4 unsigned-integer
}

Custom User Fields-In the style.ufl File

The style.ufl file should be used to add custom user fields for the application. The default style.ufl file for each gateway does not contain field definitions. The syntax for style.ufl is identical to the syntax for the style.ddd file.

The style.ufl file is referenced by the style.xfl file, which is in turn referenced by the style.ddd file. Do not edit the style.xfl or style.ddd files; instead, add your custom fields here to the style.ufl file, or change the state of fields in the style.sfl file.

style.ufl Contents

The default style.ufl file for the File System gateway is shown below.


#
# style.ufl - Application-specific User Fields
#
# These fields are included in the internal documents table. For
# more information about adding fields to the internal documents
# table, see the "Defining Custom Fields" chapter in the
# Collection Reference Guide.
#
# Example:
#
# data-table: ddf
# {
#   varwidth: MyTitle dxa
# }
# ----------------------------------------------------------------
# Specify additional application-specific fields here in their own
# data-table[s].
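
As a sketch of what an application might append to this file, the fragment below defines two custom fields using only syntax that appears in the default style.ddd and style.sfl files: varwidth and fixwidth definitions and an /alias modifier. The field names Department, Dept and ReviewDate are hypothetical. The table names ddf and dxa are taken from the example comment above and are placeholders.

```text
# Hypothetical application fields (names are illustrative only)
data-table: ddf
{
  # Variable-width text field, also searchable under the alias Dept
  varwidth: Department dxa
    /alias = Dept
  # Fixed-width 4-byte date field, same form as Date in style.sfl
  fixwidth: ReviewDate 4 date
}
```

Keeping such definitions in style.ufl, in their own data-table, follows the guidance in the file's comments and leaves style.ddd and style.xfl untouched.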




Copyright © 2002, Verity, Inc. All rights reserved.