Using the style.dft File


A style.dft file is called a document format file because it contains specifications that override the default virtual document definition. By default, the virtual document consists of a single dispatch field: the text of the document, beginning in row one, column one of the display.

To override the default virtual document definition, you must include a style.dft file in the style directory used for creating your collection.

NOTE: If you create a style.dft file that contains any fields other than a single dispatch field, and the dispatch field is filtered, your application will be unable to get the raw binary stream from the Verity engine.

style.dft File Syntax

A sample style.dft file called wsjstyle.dft is shown below. The sample file illustrates how to add Verity collection fields to the document layout. In this case, the document layout includes these elements:

Using the style.dft file above, the Verity engine invokes the ASCII filter.

style.dft File Statements

The statements used in a style.dft file are described below.

Element
Description
$control: 1
The $control statement is the first noncomment line in the style.dft file. This statement identifies the file as a Verity control file.
dft:
The dft statement identifies the control file as a style.dft file and it must appear on the second noncomment line in a style.dft file. There are three optional modifiers for the dft statement. Modifiers assigned to the dft statement apply to all values specified in the keyword statements.
/fill = yes|no
This optional modifier to the dft statement controls whether a newline character in the field value or constant produces a line break in a document window. By default, newlines are retained (/fill=no). If you enter /fill=yes, a single newline character in the field value or constant does not appear in a document window, and two consecutive newline characters are displayed as one.
/right-margin = margin_num
This optional modifier to the dft statement identifies the right margin of the field value or constant to be displayed in a document window. The right margin is expressed as an integer, and the default right margin is 0.
/tabsize = tab_chars
This optional modifier to the dft statement identifies the indent created in a document window when a tab character appears in the field value or constant. The indent created is expressed as a number of characters, and by default a tab character is translated into an 8-character indent.
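
As a rough sketch of how the statements in this table fit together, the fragment below opens a style.dft file and sets all three dft modifiers. The values 72 and 4 are arbitrary illustrations, the line beginning with # assumes the usual Verity control-file comment convention, and the keyword statements and any file terminator that a complete file needs are not shown.

```
# Hypothetical fragment: modifier values are chosen only for illustration.
$control: 1
dft:
/fill = yes
/right-margin = 72
/tabsize = 4
```

Because these modifiers are attached to the dft statement, they apply to every keyword statement that follows unless a keyword supplies its own value.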

style.dft File Keywords

The keywords used in a style.dft file are described below.

Element
Description
field: fieldname
This keyword specifies the name of a field, as defined in the internal documents table, that you want displayed with each document. These optional modifiers can be used with the field keyword:
The /filter modifier specifies which filter to use.
The /charmap modifier specifies which character map to use to map the textual output of the filters or gateways into the internal character set.
The /filter and /charmap modifiers are described in detail in "style.dft Keyword Modifiers," below.
constant: "string"
This keyword specifies a string that you want displayed with each document. The string to be displayed can contain a maximum of 132 characters, and if the string contains white space, the entire string must be enclosed in quotation marks.
system: "syscall"
This keyword specifies a system call that you want the Verity engine to execute to produce text that you want redirected to the virtual document. To specify a parameter for a field, precede the field name with a dollar sign ($). For example, for a field named title, you could enter $title in a system call. The $$ special parameter represents the name of a temporary file to hold the output of the system call; text in the temporary file is redirected to the virtual document definition for each document. For example, this system keyword specifies a script named myscript taking the title field as a parameter:
system: "myscript $title > $$"
The output of myscript is redirected to the virtual document.
zone-begin: zone_name
This keyword specifies a zone name that identifies the beginning of the zone to include in the virtual document. A special zone named "noextract" can be used to specify hidden elements (text) in the virtual document; hidden elements get indexed but cannot be viewed. To implement hidden elements, see "Hidden Elements in Zones" in Chapter 8. For complete information about zones, refer to "Defining Zones for Virtual Documents" in Chapter 8.
zone-end: zone_name
This keyword specifies a zone name that identifies the end of the zone to include in the virtual document. A special zone named "noextract" can be used to specify hidden elements (text) in the virtual document; hidden elements get indexed but cannot be viewed. To implement hidden elements, see "Hidden Elements in Zones" in Chapter 8. For complete information about zones, refer to "Defining Zones for Virtual Documents" in Chapter 8.
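
The keywords in this table might be combined along the following lines. This is only a sketch: the field names title and DOC and the zone name heading are placeholders, myscript and $title are borrowed from the system example above, and the lines beginning with # assume the usual Verity control-file comment convention.

```
# Hypothetical fragment: field, zone, and script names are placeholders.
constant: "Title:"
zone-begin: heading
field: title
zone-end: heading
# Run a script on the title field; $$ names the temporary file that
# receives the script output for the virtual document.
system: "myscript $title > $$"
# The document text, passed through the universal filter.
field: DOC
/filter="universal"
```

Each keyword contributes one element to the virtual document, in the order in which the keywords are listed.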

Shorthand Notation for zone-begin and zone-end

A shorthand notation exists for the zone-begin and zone-end combination: you can use the /zone construct instead. For example, as an alternative to the following:


zone-begin: zname
field: fname
zone-end: zname

you could substitute the following:


field: fname
/zone = zname

style.dft Keyword Modifiers

The style.dft file keywords can include one or more modifiers described below. The dft statement can have a maximum of three modifiers, and there are several more modifiers available for the keywords.

The modifiers available for the dft statement are also available for the keywords. Modifiers set on the dft statement act as global defaults for the keyword elements. If the same modifier is set on both the dft statement and a keyword, the keyword modifier takes precedence for that keyword.
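
The precedence rule can be sketched as follows. The field names title and abstract and the modifier values are hypothetical, and the lines beginning with # assume the usual Verity control-file comment convention.

```
# Hypothetical fragment: title and abstract are placeholder field names.
$control: 1
dft:
/fill = yes
/tabsize = 4
# Inherits /fill=yes and /tabsize=4 from the dft statement.
field: title
# Overrides the fill setting for this field only; /tabsize=4 still applies.
field: abstract
/fill = no
```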

Modifier
Description
/filter="value"
This modifier specifies which filter will be used. If not specified, the internal ASCII filter will be used. Valid values are:

universal for the universal filter (the default; see "Hidden Elements in Zones" in Chapter 8);
flt_pdf for the PDF filter (see "The PDF Filter" later in this chapter);
flt_xml for the XML filter;
zone [-mode] for the zone filter (see Chapter 8, "The Zone Filter").

It is recommended that you use the universal filter for all filtering due to its superior performance and handling of character sets.
/charmap
This modifier is used to specify the character set that the document is written in. The search engine will automatically character map the text of the document into the internal character set if necessary before it is indexed or viewed. This modifier is required to properly map any document containing non-ASCII characters. These character map codes can be entered as follows for the Western European languages:
1252 for code page 1252;
850 for IBM code page 850;
8859 for ISO-8859;
mac1 for Macintosh systems.
For Asian localizations, you can enter a character map defined for the locale.

/fill
This optional modifier controls whether a newline character in the field value or constant produces a line break in a document window. By default, newlines are retained (/fill=no). If you enter /fill=yes, a single newline character in the field value or constant does not appear in a document window, and two consecutive newline characters are displayed as one. If specified, this fill option overrides the fill option set by the same modifier in the dft statement.
/right-margin
This optional modifier specifies the right margin of the field value or constant to be displayed in a document window. The right margin is expressed as an integer, and the default right margin is zero. If specified, the right margin given overrides the right margin specified in the same modifier in the dft statement.
/tabsize
This optional modifier specifies the indent created in a document window when a tab character appears in the field value or constant. The indent created is expressed as a number of characters, and by default a tab character is translated into an 8-character indent. If specified, the tab given overrides the tab specified in the same modifier in the dft statement.
/row
This optional modifier specifies the row number in which the field value or string will be displayed. The first row of a virtual document display is row one.
/col
This optional modifier specifies the column number in which the first character of the field value or string will be displayed. The left-most column of a virtual document display is column one.
/delta-row
This optional modifier specifies a row number, relative to the text above it, where you want the field value or string displayed. For example, if a field is defined to appear in row four, and you specify /delta-row=2 for a second field, the second field appears two rows further down, in row six.
/delta-col
This optional modifier specifies a column, relative to the right-most character in a row, where you want the first character of a field value or string displayed. For example, if a field is defined to appear in row three from column 5 to column 15, and you specify /delta-col=5 for a second field, the second field appears five columns further right, beginning in column 20.
/hidden
This optional modifier specifies that one or more fields are defined as hidden elements. Valid values are:
YES to treat the zone's fields as hidden elements;
NO (default) to not treat the zone's fields as hidden elements.
A special zone named "noextract" is used to specify hidden elements (text) in the virtual document; hidden elements get indexed but cannot be viewed. To implement hidden elements, see "Hidden Elements in Zones" in Chapter 8. For complete information about zones, refer to "Defining Zones for Virtual Documents" in Chapter 8.
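
The positioning modifiers in this table can be illustrated with a short sketch that mirrors the examples given for /delta-row and /delta-col. The field names author, date, and source are hypothetical, the lines beginning with # assume the usual Verity control-file comment convention, and the column arithmetic supposes that the author value happens to fill columns 5 to 15.

```
# Hypothetical fragment: author, date, and source are placeholder field names.
# Starts in row 3, column 5; suppose the value fills columns 5 to 15.
field: author
/row = 3
/col = 5
# Starts five columns past the right-most character: 15 + 5 = column 20.
field: date
/delta-col = 5
# Placed relative to the text above it: row 3 + 2 = row 5.
field: source
/delta-row = 2
```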

Date Formats in the style.dft File

If one of the fields in your style.dft file is a date field, you must use the same date output format for both indexing and viewing. If you do not, you may have incorrect highlights (unless your retrieval client uses dynamic highlighting).

Late Binding for Field Elements

The Verity engine uses late binding for field elements in the virtual document, meaning that the value of a field specified in the style.dft file is not read until the field element is actually inserted into the stream. This enables field values populated by gateways and filters, such as HTML META tags and Microsoft Office properties, to be added to the document stream following the text of the main document.

It is not possible to capture the values of the VdkSummary and VdkFeatures fields in the virtual document because these fields are generated after the entire virtual document has already been streamed.
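
A layout that relies on late binding might look like the sketch below. The field names DOC and keywords are hypothetical, as is the idea that the filter fills keywords from an HTML META tag, and the lines beginning with # assume the usual Verity control-file comment convention.

```
# Hypothetical fragment: DOC and keywords are placeholder field names.
# The document text is streamed first; the filter may populate fields
# while it reads the document.
field: DOC
/filter="universal"
# This field is read only when its element is inserted into the stream,
# so a value set by the filter is available at this point.
constant: "Keywords:"
field: keywords
```

Placing the keywords element after the document text is what allows the filter-supplied value to appear. Listing VdkSummary or VdkFeatures in the same way would not work, for the reason given above.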





Copyright © 2002, Verity, Inc. All rights reserved.