Using didump


Using the didump command-line tool, you can view the components of the word index for each partition: the word list, the zone list, and the zone attribute list. The word list contains all words indexed by the Verity engine. The zone list contains all zones found by the engine. The zone attribute list contains the zone attributes found by the engine.

didump can be found in the Verity bin directory. In a typical installation, the path is:

/verity/prdname/k2/_platform/bin/didump

where verity/prdname represents the user-definable portion of the installation directory, and _platform represents the platform name (like _ssol26 for Solaris).
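
As a small illustration of how the path is assembled, the sketch below joins an installation root and a platform name into the location of didump. The function name didump_path is hypothetical, and the sample values are taken from the text above; your installation root and platform name will differ.

```python
from pathlib import PurePosixPath


def didump_path(install_root, platform):
    """Build <install_root>/k2/<platform>/bin/didump as a POSIX-style path.

    install_root is the user-definable part of the installation directory
    (for example /verity/prdname); platform is the platform directory name
    (for example _ssol26 for Solaris).
    """
    return str(PurePosixPath(install_root) / "k2" / platform / "bin" / "didump")


print(didump_path("/verity/prdname", "_ssol26"))
# /verity/prdname/k2/_ssol26/bin/didump
```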

Viewing the Word List

You can view the contents of the word list for a partition by running didump with the -words flag and the path name of a partition file, like this:

didump -words /z/collref/html/parts/00000003.did

The display provides an alphabetical listing of the words in the word index, as shown below.


didump - Verity, Inc. Version 4.0.1 (_nti40, Jun 7 2001)
Text            Size  Doc  Word
A                 10    3     4
a                 34    5    24
abbreviations      4    1     1
about              4    1     1
acronym            5    1     2
acronyms           4    1     1
actual             4    1     1
administrator      3    1     1
advance            3    1     1
all                8    2     3
also               9    2     4
Always             4    1     1
always             9    2     3
ampersand          4    1     1
...

The columns in the display indicate:

To view the occurrences of a specific word or pattern, enter a command using the -pattern option, as in the following example:

didump -pattern acronym 00000003.did

The didump command-line tool displays information about the number of occurrences of the word "acronym". You can display the individual occurrences of a word using the verbose (-verbose) option.
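
If you capture the -words display in a script, the listing can be split into fields for further checks. The following is a minimal sketch, not part of the Verity tools: it assumes each data line ends with three integer columns (Size, Doc, Word, as named in the header) and treats everything before them as the Text column. The function names and the shortened sample are illustrative only. It reports words that are indexed under more than one capitalization, as with "Always" and "always" in the sample display.

```python
from collections import defaultdict

# Shortened copy of the -words display shown earlier in this section.
SAMPLE = """\
didump - Verity, Inc. Version 4.0.1 (_nti40, Jun 7 2001)
Text Size Doc Word
A 10 3 4
a 34 5 24
acronym 5 1 2
Always 4 1 1
always 9 2 3
...
"""


def parse_word_list(output):
    """Return (text, size, doc, word) tuples for every data line."""
    rows = []
    for line in output.splitlines():
        # Split off the last three fields; the rest is the Text column.
        parts = line.rsplit(None, 3)
        # Skip the banner, the header, and the "..." continuation marker.
        if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
            continue
        text, size, doc, word = parts
        rows.append((text, int(size), int(doc), int(word)))
    return rows


def case_variants(rows):
    """Group entries that differ only in capitalization."""
    groups = defaultdict(list)
    for text, _size, _doc, _word in rows:
        groups[text.lower()].append(text)
    return {key: forms for key, forms in groups.items() if len(forms) > 1}


rows = parse_word_list(SAMPLE)
print(case_variants(rows))
# {'a': ['A', 'a'], 'always': ['Always', 'always']}
```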

Viewing the Zone List

The zone list contains a list of the zones identified by the zone filter. The zones listed can be searched using the Verity IN operator in a query. To view the contents of the zone list, use didump with the -zones flag plus the path name to a partition, like this:

didump -zones /z/collref/html/parts/00000003.did

The partition above is for a collection containing documents in HTML format. The Verity universal filter invoked the HTML filter by default and indexed the documents using these zones.


didump - Verity, Inc. Version 4.0.1 (_ssol26, Jun 07 2001)
ZoneName  Fmt     Size  Doc Regions
A         Wct    10239   85    5016
ADDRESS   Array     34    1       1
BODY      Array    197   85      85
CAPTION   Wct      298   31      85
CODE      Wct     3868   66    1829
H1        Array     80   83      83
H2        Wct      646   53     212
H3        Wct      517   49     171
H4        Wct      128    8      47
HEAD      Array     70   85      85
HTML      Array    165   85      85
TITLE     Array     70   85      85

The columns in the display indicate:

For complete information about how zones are defined, refer to Chapter 8, "The Zone Filter."
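
The -zones display can be read into named fields in the same spirit. The sketch below is an illustration under stated assumptions, not a Verity utility: it assumes each data line has exactly five whitespace-separated fields in the header's order (ZoneName, Fmt, Size, Doc, Regions), with the last three being integers. ZoneRow, parse_zone_list, and zones_by_format are hypothetical names. It groups zone names by the value in the Fmt column without interpreting what that value means.

```python
from typing import NamedTuple

# Shortened copy of the -zones display shown earlier in this section.
SAMPLE = """\
didump - Verity, Inc. Version 4.0.1 (_ssol26, Jun 07 2001)
ZoneName Fmt Size Doc Regions
A Wct 10239 85 5016
BODY Array 197 85 85
H1 Array 80 83 83
H2 Wct 646 53 212
TITLE Array 70 85 85
"""


class ZoneRow(NamedTuple):
    name: str
    fmt: str
    size: int
    doc: int
    regions: int


def parse_zone_list(output):
    """Return one ZoneRow per data line, skipping banner and header."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data lines have five fields, the last three numeric.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
            continue
        name, fmt, size, doc, regions = fields
        rows.append(ZoneRow(name, fmt, int(size), int(doc), int(regions)))
    return rows


def zones_by_format(rows):
    """Map each Fmt value to the zone names listed with it."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row.fmt, []).append(row.name)
    return grouped


zones = parse_zone_list(SAMPLE)
print(zones_by_format(zones))
# {'Wct': ['A', 'H2'], 'Array': ['BODY', 'H1', 'TITLE']}
```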

Viewing the Zone Attribute List

The zone attribute list contains a list of the HTML attributes for the zones identified by the HTML zone filter. The zone attributes listed can be searched using the Verity IN operator together with the WHEN operator in a query. To view the contents of the zone attribute list, use didump with the -attributes flag plus the path name to a partition, like this:

didump -attributes /z/collbldg/html/parts/00000003.did

The partition above is for a collection containing the Verity Collection Reference Guide in HTML format.


didump - Verity, Inc. Version 4.0.1 (_ssol26, Jun 7 2001)
Text                      Size  Doc  Word
href 01_cbg.htm             10    2     4
href 01_cbg.htm#282870       3    1     1
href 01_cbg.htm#282872       6    2     2
href 01_cbg1.htm             8    2     3
href 01_cbg1.htm#286513      7    2     2
href 01_cbg1.htm#286520      3    1     1
...

The columns in the display indicate:
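
Separately from the column descriptions, a script can split the -attributes display into an attribute name, an attribute value, and the three numeric columns. The sketch below is illustrative only and rests on assumptions inferred from the sample display: the Text column holds an attribute name followed by a single space and its value, and the last three fields are integers. The function names are hypothetical. It counts how many href entries point into each target file, ignoring the fragment after "#".

```python
from collections import Counter

# Shortened copy of the -attributes display shown earlier in this section.
SAMPLE = """\
didump - Verity, Inc. Version 4.0.1 (_ssol26, Jun 7 2001)
Text Size Doc Word
href 01_cbg.htm 10 2 4
href 01_cbg.htm#282870 3 1 1
href 01_cbg.htm#282872 6 2 2
href 01_cbg1.htm 8 2 3
...
"""


def parse_attribute_list(output):
    """Return (name, value, size, doc, word) tuples for every data line."""
    entries = []
    for line in output.splitlines():
        # The last three fields are numeric; the rest is the Text column.
        parts = line.rsplit(None, 3)
        if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
            continue
        # Assumption: Text is "<attribute name> <attribute value>".
        name, _, value = parts[0].partition(" ")
        entries.append(
            (name, value.strip(), int(parts[1]), int(parts[2]), int(parts[3]))
        )
    return entries


def href_entries_per_file(entries):
    """Count href entries per target file, dropping any '#fragment'."""
    return Counter(
        value.partition("#")[0]
        for name, value, _size, _doc, _word in entries
        if name == "href"
    )


entries = parse_attribute_list(SAMPLE)
print(dict(href_entries_per_file(entries)))
# {'01_cbg.htm': 3, '01_cbg1.htm': 1}
```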





Copyright © 2002, Verity, Inc. All rights reserved.