>>-+---------------------------------------------------+-------->
   '-THESAURUS--"thesaurus-name"--+-----------------+--'
                                  '-COUNT--"depth"--'

>-----+-----------------------+--------------------------------->
      '-RESULT LIMIT--number--'

>-----+-| boolean-argument |--+---------------------------+---+-><
      |                       '-&--| freetext-argument |--'   |
      '-+--------------------------+---| freetext-argument |--'
        '-| boolean-argument |--&--'

boolean-argument

      .-& or |------------------------------------------------------------.
      V                                                                   |
|-----+-| search-factor |----------------------------------------------+--+->
      |                           .-----------------------------.      |
      |                           V                             |      |
      '-(--| search-factor |------+- & -+---| search-factor |---+---)--'
                                  '- | -'

>---------------------------------------------------------------|

freetext-argument

|---IS ABOUT----+-----------------+---+-----------+------------->
                +-SYNONYM FORM OF-+   '-language--'
                +-feature---------+
                '-| thesaurus |---'

>----"phrase-or-sentence"----+------------------------+---------|
                             '-ESCAPE--"escape-char"--'

search-factor

|---+---------------------------------------------------------------------+->
    |                                           .-,---------------.       |
    |                                           V                 |       |
    '-+--------------------+---+-SECTION--+--(-----section-name---+---)---'
      '-MODEL--model-name--'   '-SECTIONS-'

>---| search-element |------------------------------------------|

search-element

|---+-+-----+--| search-primary |--------------------------------------+->
    | '-NOT-'                                                          |
    |                                            .-AND---------------. |
    |                                            V                   | |
    '-| s.-primary |--+-IN SAME PARAGRAPH AS-+------| s.-primary |---+-'
                      '-IN SAME SENTENCE AS--'

>---------------------------------------------------------------|

search-primary

|---+-| search-atom |----------------+--------------------------|
    |    .-,------------------.      |
    |    V                    |      |
    '-(-----| search-atom |---+---)--'

search-atom

|---+-----------------------------+---+-----------+------------->
    +-PRECISE FORM OF-------------+   '-language--'
    +-STEMMED FORM OF-------------+
    +-FUZZY FORM OF--match-level--+
    +-SYNONYM FORM OF-------------+
    +-BOUND-----------------------+
    +-SOUNDS LIKE-----------------+
    +-feature---------------------+
    '-| thesaurus |---------------'

>----"word-or-phrase"----+-----------------------------+--------|
                         '-ESCAPE--"escape-character"--'

thesaurus (if THESAURUS is specified)

|---+---------------------+---TERM OF---------------------------|
    '-EXPAND--"relation"--'
Examples
Examples are given in Specifying search arguments.
Search parameters
IS ABOUT: An option that lets you specify a free-text search argument, that is, a natural-language phrase or sentence that describes the concept to be found. See Free-text and hybrid search.
MODEL model-name: The model name must be specified in a document model file, described in Working with structured documents. The model name can be masked using wildcard characters.
If you do not specify a model, the default model specified during index creation is used.
SECTION | SECTIONS section-name: A keyword used to specify one or more sections to which the search is restricted. The section name must be specified in a model in a document model file, described in Working with structured documents. A section name can be masked using the wildcard characters % and _.
Sections can be nested within other sections, for example:
play/Act/Title=play/act/title
Restrictions: Searching in nested sections is possible only for documents stored in columns enabled with format XML. For Ngram indexes, only one section name can be searched, and XML format is not supported.
THESAURUS thesaurus-name: A keyword used to specify the name of the thesaurus to be used to expand the search term. The thesaurus name is the file name (without its extension) of a thesaurus that has been compiled using the thesaurus compiler TXTHESC or TXTHESN. Two default thesauri, desthes and desnthes, are stored in the sample directory; desnthes is an Ngram thesaurus. You can also specify the file's path name. The default path name is the dictionary path.
COUNT depth: A keyword used to specify the number of levels (the depth) of terms in the thesaurus that are to be used to expand the search term for the given relation. If you do not specify this keyword, a count of 1 is assumed.
RESULT LIMIT number: A keyword used to specify the maximum number of entries to be returned in the result list. number is a value from 1 to 32767. If a free-text search is used, the entries are ranked with respect to the complete search result list. Otherwise, the entries are ranked only with respect to the entries in the limited list.
EXPAND relation: A keyword used to specify the relation, such as INSTANCE, between the search term specified in TERM OF and the thesaurus terms to be used to expand the search term. The relation name must correspond to a relation used in the thesaurus. See Thesaurus concepts.
For an Ngram thesaurus, use the member-relation name described in Creating an Ngram thesaurus. For user-defined member relations, use :RELATION n, where n is the member relation number specified in :RELATED (number).
TERM OF "word-or-phrase": The search term, or multi-word search term, to which other search terms are to be added from the thesaurus.
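The thesaurus keywords described above combine into a single search argument. The following Python sketch assembles such a string in the order shown in the syntax diagram. The function name and its parameters are hypothetical, the depth is quoted only because the diagram shows COUNT "depth", and the term is assumed to contain no double quotation marks.

```python
def thesaurus_argument(term, relation=None, thesaurus="desthes", depth=None):
    """Build a search argument that expands `term` through a thesaurus.

    Order follows the syntax diagram:
    THESAURUS "name" [COUNT "depth"] [EXPAND "relation"] TERM OF "term"
    """
    if depth is not None and depth < 1:
        raise ValueError("depth must be at least 1")
    parts = [f'THESAURUS "{thesaurus}"']
    if depth is not None:
        # Quoted because the syntax diagram shows COUNT "depth" (assumption).
        parts.append(f'COUNT "{depth}"')
    if relation is not None:
        parts.append(f'EXPAND "{relation}"')
    parts.append(f'TERM OF "{term}"')
    return " ".join(parts)


print(thesaurus_argument("transport", relation="INSTANCE", depth=2))
```

Omitting `depth` leaves out the COUNT keyword, in which case a count of 1 is assumed as described above.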
The logical AND (&) operator binds more strongly than the logical OR (|) operator. Example:
"passenger" & "vehicle" | "transport" & "public"
is evaluated as:
("passenger" & "vehicle") | ("transport" & "public")
To search for:
"passenger" & ("vehicle" | "transport") & "public"
you must include the parentheses as shown.
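The precedence rule can be checked with a small evaluator. The sketch below is not Text Extender's parser; it is a hypothetical recursive-descent evaluator that treats a document as a plain set of words and applies & before |, with parentheses overriding that order.

```python
import re

# A token is a quoted term (with "" as an embedded quote), &, |, ( or ).
TOKEN = re.compile(r'\s*("(?:[^"]|"")*"|[&|()])')


def tokenize(argument):
    """Split a boolean argument into quoted terms, operators, and parentheses."""
    tokens, pos = [], 0
    argument = argument.strip()
    while pos < len(argument):
        match = TOKEN.match(argument, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(argument, words):
    """Return True if the set `words` satisfies the boolean `argument`."""
    tokens = tokenize(argument)
    pos = 0

    def parse_or():          # lowest precedence: a | b
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = parse_and()
            result = result or right
        return result

    def parse_and():         # higher precedence: a & b
        nonlocal pos
        result = parse_operand()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = parse_operand()
            result = result and right
        return result

    def parse_operand():     # a quoted term or a parenthesized expression
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of argument")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token.startswith('"'):
            return token[1:-1].replace('""', '"') in words
        raise ValueError(f"unexpected token {token!r}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


doc = {"transport", "public"}
print(evaluate('"passenger" & "vehicle" | "transport" & "public"', doc))
print(evaluate('"passenger" & ("vehicle" | "transport") & "public"', doc))
```

For a document that contains only "transport" and "public", the first argument is satisfied through its second AND group, while the parenthesized form is not, because it requires "passenger" in every case.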
When NOT is used in a search factor, you cannot use the SYNONYM FORM OF keyword.
The following search argument finds text documents containing the term "traffic" only if the term "air" is in the same paragraph.
"traffic" IN SAME PARAGRAPH AS "air"
You cannot use the IN SAME PARAGRAPH AS keyword when NOT is used in a search factor.
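The effect of the proximity keywords IN SAME PARAGRAPH AS and IN SAME SENTENCE AS can be illustrated with a naive Python sketch. How Text Extender actually detects paragraph and sentence boundaries is not described here, so the boundary rules below (a blank line ends a paragraph; ., !, or ? followed by white space ends a sentence) are assumptions, and the function name is hypothetical.

```python
import re


def in_same_unit(text, terms, unit="sentence"):
    """Return True if every term occurs together in one sentence or paragraph."""
    if unit == "paragraph":
        chunks = re.split(r"\n\s*\n", text)         # assumed: blank line ends a paragraph
    else:
        chunks = re.split(r"(?<=[.!?])\s+", text)   # assumed: naive sentence boundary
    wanted = {term.lower() for term in terms}
    for chunk in chunks:
        words = set(re.findall(r"\w+", chunk.lower()))
        if wanted <= words:                         # all terms present in this chunk
            return True
    return False


text = (
    "Rain in the forest causes erosion of the land. Air traffic is unrelated.\n\n"
    "Traffic is heavy today."
)
print(in_same_unit(text, ["traffic", "air"], unit="paragraph"))
print(in_same_unit(text, ["forest", "rain", "erosion", "land"], unit="sentence"))
```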
The following search argument searches for "forest", "rain", "erosion", and "land" in the same sentence.
"forest" IN SAME SENTENCE AS "rain" AND "erosion" AND "land"
The following statement is true if one or more of the search arguments is found.
CONTAINS (mytexthandle, '( "text", "graphic", "audio", "video")') = 1
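A predicate like the one above can be assembled programmatically. The following Python sketch builds the comma-separated list of search terms, doubles any double quotation marks inside a term, and doubles single quotation marks so that the whole argument is a valid SQL string constant. The function name is hypothetical, and the handle column name is inserted as given without validation.

```python
def contains_predicate(handle_column, terms):
    """Build a CONTAINS predicate that is true if any of `terms` is found."""
    # Double embedded double quotes, as required inside a quoted search term.
    quoted = ['"' + term.replace('"', '""') + '"' for term in terms]
    search_argument = "(" + ", ".join(quoted) + ")"
    # Double single quotes so the argument is a valid SQL string constant.
    sql_literal = "'" + search_argument.replace("'", "''") + "'"
    return f"CONTAINS ({handle_column}, {sql_literal}) = 1"


print(contains_predicate("mytexthandle", ["text", "graphic", "audio", "video"]))
```

When the search terms come from user input, passing the search argument through a parameter marker is usually safer than concatenating it into the statement text.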
The search term processing is described in more detail in Table 6.
| Search atom keyword | Linguistic index | Precise index | Precise normalized index | Dual index | Ngram index | Ngram index, case-enabled |
|---|---|---|---|---|---|---|
| PRECISE FORM OF | | X | X | X | | O |
| STEMMED FORM OF | X | | | O | O | O |
| FUZZY FORM OF | | | | | O | O |
| IS ABOUT | O | O | O | O | | |
| SYNONYM FORM OF | O | O | O | O | | |
| EXPAND | O | O | O | O | | |
| SOUNDS LIKE | O | O | O | O | | |
| IN SAME SENTENCE AS | O | O | O | O | O | O |
| IN SAME PARAGRAPH AS | O | O | O | O | O | O |
| BOUND | | | | | O | O |

X = default setting, O = function available
Table 6. Search term options for Ngram indexes

| Search atom keyword | Case: sensitive | Case: insensitive | Stemming | Match: exact | Match: fuzzy |
|---|---|---|---|---|---|
| PRECISE FORM OF | when case-enabled | X | | X | |
| STEMMED FORM OF | | X | X | | |
| FUZZY FORM OF | | X | | | X |

X = default setting
If you use a keyword that is not available for that index type, it is ignored and either the default keyword is used instead, or a message is returned.
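The availability rules can be encoded as a small lookup table. The Python sketch below transcribes the keyword-by-index-type table of this section as it reads in the text; the names `AVAILABILITY`, `is_available`, and `default_keyword` are hypothetical and are not part of Text Extender.

```python
INDEX_TYPES = ["linguistic", "precise", "precise normalized", "dual",
               "ngram", "ngram case-enabled"]

# One character per index type, in INDEX_TYPES order:
# "X" = default setting, "O" = function available, "-" = not available.
AVAILABILITY = {
    "PRECISE FORM OF":      "-XXX-O",
    "STEMMED FORM OF":      "X--OOO",
    "FUZZY FORM OF":        "----OO",
    "IS ABOUT":             "OOOO--",
    "SYNONYM FORM OF":      "OOOO--",
    "EXPAND":               "OOOO--",
    "SOUNDS LIKE":          "OOOO--",
    "IN SAME SENTENCE AS":  "OOOOOO",
    "IN SAME PARAGRAPH AS": "OOOOOO",
    "BOUND":                "----OO",
}


def status(keyword, index_type):
    """Return "X", "O", or "-" for a keyword and index type."""
    return AVAILABILITY[keyword][INDEX_TYPES.index(index_type)]


def is_available(keyword, index_type):
    """True if the keyword is the default or is available for the index type."""
    return status(keyword, index_type) != "-"


def default_keyword(index_type):
    """Return the keyword marked as default for the index type, or None."""
    for keyword in AVAILABILITY:
        if status(keyword, index_type) == "X":
            return keyword
    return None


print(default_keyword("linguistic"))
print(is_available("FUZZY FORM OF", "precise"))
```

An application could call `is_available` before building a search argument, so that it does not depend on a keyword being silently ignored or replaced.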
PRECISE FORM OF: This is the default option for precise and dual indexes. For a precise normalized index, the default form of search is not case-sensitive. If you specify this keyword for a linguistic index, it is ignored and STEMMED FORM OF is assumed.
STEMMED FORM OF: The way in which words are reduced to their stem form is language-dependent.
Example: programming computer systems is replaced by program compute system when you use the US-English dictionary, and by programme compute system when you use the UK-English dictionary.
This search phrase can find "programmer computes system", "program computing systems", "programming computer system", and so on.
This is the default option for linguistic indexes. If you specify this keyword for a precise index, it is ignored and PRECISE FORM OF is assumed instead.
match-level: An integer from 1 to 5 specifying the degree of similarity, where 5 is more similar than 1.
SYNONYM FORM OF: Synonyms for a phrase are alternative phrases containing all the possible combinations of synonyms that can be obtained by replacing each word of the original phrase by one of its synonyms. The word sequence remains as in the original phrase.
If you specify this keyword for a precise index, it is ignored and PRECISE FORM OF is assumed instead.
If you specify this keyword for a dual index, the search is made using the linguistic part of the dual index rather than the precise part.
You cannot specify this keyword when NOT is used in the search factor, or when the word or phrase to be searched for contains masking characters.
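The way synonym phrases are formed can be sketched as a Cartesian product over the alternatives for each word. In the Python sketch below, the synonym dictionary is made-up sample data, the function name is hypothetical, and each word is assumed to count as one of its own alternatives so that the original phrase is included.

```python
from itertools import product


def synonym_phrases(phrase, synonyms):
    """Return every phrase obtained by swapping words for their synonyms."""
    choices = []
    for word in phrase.split():
        # Assumption: each word is also one of its own alternatives.
        choices.append([word] + synonyms.get(word, []))
    # The word sequence is preserved; only the word at each position varies.
    return [" ".join(combo) for combo in product(*choices)]


sample = {"fast": ["quick"], "car": ["automobile", "auto"]}
for candidate in synonym_phrases("fast car", sample):
    print(candidate)
```

The number of phrases grows multiplicatively with the number of synonyms per word, which is one reason a real implementation would not necessarily enumerate them this way.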
Linguistic processing includes synonym processing and word-stem processing. See The supported languages for more information.
The supported languages are listed in Languages.
Note: When searching in documents that are not in U.S. English, you must specify the language in the search argument regardless of the default language.
Precise or linguistic search. Text Extender can search using either the precise form of the word or phrase, or a variation of it. If you do not specify one of the options in Table 5, the default linguistic options are used according to which type of index is being used.
To search for a character string that contains double quotation marks, type the double quotation marks twice. For example, to search for the text "wildcard" character, use:
"""wildcard"" character"
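Masking characters. A word can contain the following masking characters: the underscore (_), which represents any single character, and the percent sign (%), which represents any number of arbitrary characters.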
A word cannot be composed exclusively of masking characters, except when a single % is used to represent an optional word.
If you use a masking character, you cannot use SYNONYM FORM OF, feature, or THESAURUS.
Example: If escape-character is $, then $%, $_, and $$ represent %, _, and $ respectively. Any % and _ characters not preceded by $ represent masking characters.
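The interaction between masking characters and the escape character can be illustrated by translating a masked word into a regular expression. This Python sketch assumes that % stands for any run of characters and _ for exactly one character; the function name is hypothetical, and the code is not how Text Extender evaluates masks.

```python
import re


def mask_to_regex(word, escape=None):
    """Translate a masked search word into a compiled regular expression."""
    pattern = []
    i = 0
    while i < len(word):
        char = word[i]
        if escape is not None and char == escape:
            if i + 1 >= len(word):
                raise ValueError("escape character at end of word")
            # The character after the escape character is taken literally.
            pattern.append(re.escape(word[i + 1]))
            i += 2
            continue
        if char == "%":
            pattern.append(".*")   # assumed: any run of characters
        elif char == "_":
            pattern.append(".")    # assumed: exactly one character
        else:
            pattern.append(re.escape(char))
        i += 1
    return re.compile("".join(pattern))


print(bool(mask_to_regex("comp%").fullmatch("computer")))
print(bool(mask_to_regex("100$%", escape="$").fullmatch("100%")))
print(bool(mask_to_regex("100$%", escape="$").fullmatch("100 percent")))
```

With $ as the escape character, 100$% matches only the literal text 100%, whereas the unescaped 100% would match any word that begins with 100.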
Summary of rules and restrictions