Using merge


The merge command-line tool lets you combine multiple collections with identical schemas. This is useful for merging smaller collections built from different sources into one large collection. You can also use the merge command-line tool to break up a large collection into smaller collections of roughly uniform size.

Collections can be merged only if they have identical schemas: they must have exactly the same set of style files (and style file entries).

Breaking up a large collection helps to optimize search performance, because it allows many applications to perform multiple concurrent search requests over the different collections. After breaking up a large collection, you can also discard older collections to reclaim limited disk storage space.

The merge command-line tool is located in the Verity bin directory. In a typical installation, the path is:

/verity/prdname/k2/_platform/bin/merge

where verity/prdname represents the user-definable portion of the Verity installation directory name, and _platform represents the platform name (for example, _nti40 for Windows NT 4.0).

To obtain help for the merge command-line tool, enter the following command:

merge -help

NOTE: After running the merge command-line tool, you must optimize the collection, using the mkvdk -optimize option.

Merging Collections

The following is the syntax for using the merge command-line tool to merge multiple collections into a single collection:

merge newCollection srcCollection1 srcCollection2 [srcCollectionN]

The command-line tool reads srcCollection1, srcCollection2, and so on, and merges them into a single collection with the directory name given for newCollection. If the directory named by newCollection does not exist, it is created.
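As an illustration of the syntax above, the following minimal Python sketch assembles the argument list for a merge invocation. The helper name build_merge_command and the directory names are hypothetical and used only for illustration; the sketch only builds the command line and does not run the tool.

```python
from typing import List, Sequence


def build_merge_command(new_collection: str, sources: Sequence[str]) -> List[str]:
    """Return an argument list in the documented order:
    merge newCollection srcCollection1 srcCollection2 [srcCollectionN]
    """
    if isinstance(sources, str):
        raise TypeError("sources must be a sequence of directory names, not one string")
    if len(sources) < 2:
        # The documented syntax lists at least two source collections.
        raise ValueError("at least two source collections are required")
    return ["merge", new_collection, *sources]


if __name__ == "__main__":
    # Hypothetical collection directory names.
    command = build_merge_command("all_news", ["news_2000", "news_2001", "news_2002"])
    print(" ".join(command))
```

The resulting list could be passed to a process launcher such as subprocess.run, assuming the merge executable is on the search path.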

Splitting Collections

The following is the syntax for using the merge command-line tool to split a single large collection into smaller collections:

merge -split srcCollection newCollection1 newCollection2 [-number]

The command-line tool reads srcCollection and splits it into roughly equal-sized pieces, using the names given for newCollection1, newCollection2, and so on.

If you want to split a very large collection into a large number of new collections, you can use the following option instead of explicitly naming each new collection:

merge -split -number newCollection srcCollection

The command-line tool reads the collection identified by srcCollection and splits it into the number of segments specified by the -number option. The name of the first new collection is generated by appending the two-letter suffix aa to the directory name given for newCollection. Each subsequent name is generated by incrementing one of the appended letters (up to zz), for a maximum of 676 partitions. For example, if the value of -number is 3 and the value of newCollection is Collection1, the collections are named Collection1aa, Collection1ab, and Collection1ac.
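The naming scheme can be sketched in Python as follows. This is an illustration, not the tool's implementation: the function name split_collection_names is hypothetical, and the rollover from az to ba is an assumption, since the text only shows the first three suffixes.

```python
from string import ascii_lowercase
from typing import List

MAX_PARTITIONS = 26 * 26  # suffixes aa through zz give 676 names


def split_collection_names(new_collection: str, number: int) -> List[str]:
    """List the collection names expected for a split into `number` parts."""
    if not 1 <= number <= MAX_PARTITIONS:
        raise ValueError(f"number must be between 1 and {MAX_PARTITIONS}")
    names = []
    for index in range(number):
        # Assumption: the second letter advances first (aa, ab, ..., az, ba, ...).
        first, second = divmod(index, 26)
        names.append(new_collection + ascii_lowercase[first] + ascii_lowercase[second])
    return names


if __name__ == "__main__":
    print(split_collection_names("Collection1", 3))
```

For the example in the text, the sketch yields Collection1aa, Collection1ab, and Collection1ac.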

NOTE: The directory name given for newCollection must be at least two characters shorter than the maximum name length allowed by the file system, to leave room for the two-letter suffix.
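The length restriction in the note can be checked before running a split, as in the following Python sketch. The function name and the limit of 255 characters are assumptions for illustration only; the actual limit depends on the file system, and some file systems count bytes rather than characters.

```python
SUFFIX_LENGTH = 2  # the two appended letters, aa through zz


def new_collection_name_fits(new_collection: str, fs_name_limit: int) -> bool:
    """Return True if the name leaves room for the two-letter suffix."""
    return len(new_collection) <= fs_name_limit - SUFFIX_LENGTH


if __name__ == "__main__":
    # 255 is an assumed limit, not a value taken from the Verity documentation.
    print(new_collection_name_fits("Collection1", 255))
```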





Copyright © 2002, Verity, Inc. All rights reserved.