Simple Query Parser
The simple query parser supports searching over the full text of documents in addition to searching over collection fields and zones. The simple parser is sometimes referred to as the "full text" parser. The simple parser interprets the Verity query language.
A unique feature of the simple query parser is that it can translate a query expression supplied by the user into a more robust query form without requiring the user to specify much syntax. For example, if a user enters a single word, the simple query parser applies the MANY modifier and the STEM operator to the word by default. This more robust query form, specifically "<MANY> <STEM> word", causes the search engine to search for a broader range of documents containing evidence of the user's query.
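The default expansion of a single word can be sketched in a few lines of Python. This is only an illustration of the translation described above, not the parser's actual implementation; the function name expand_single_word is hypothetical.

```python
def expand_single_word(word: str) -> str:
    """Return the expanded query form for a single unquoted word.

    Hypothetical helper: it only mirrors the documented default of
    applying the MANY modifier and the STEM operator to one word.
    """
    word = word.strip()
    if not word or any(ch.isspace() for ch in word):
        raise ValueError("expected exactly one word")
    return f"<MANY> <STEM> {word}"


print(expand_single_word("meet"))
```

Under these assumptions, entering meet yields the form "<MANY> <STEM> meet".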
Behaviors of the simple query parser are listed below.
- 1. An individual word is interpreted as a stemmed word or a topic name, unless the word is surrounded by double quotation marks. When processing the search, the search engine first checks to see whether the word matches a topic name, and if a match is found, the topic is used. If a match is not found or if topics are not implemented, the word is interpreted as a stemmed word. The MANY modifier and the STEM operator are applied to a single word.
- 2. When a word is interpreted as a stemmed word, the search engine broadens the search to include the word itself along with the stemmed variations of the word. For example, if a user enters the word "meet" (without double quotation marks, as in: meet) in the search form and then starts a search, the search engine will look for these words: "meet," "meets," and "meeting."
- 3. When matching a query expression of two or more words against topic names, spaces are interpreted as hyphens. This means that if the phrase BIG COMPANIES is supplied as part of a query expression, the application looks for the topic BIG-COMPANIES. If a topic name match is found, the topic is used to process the search.
- 4. To specify a literal word so that words that have the same stem will not be considered in the search, the user can surround the word with double quotation marks. For example, to search for documents that contain the word "tropic" and not consider words that have the same stem, such as "tropics" or "tropical," the word "tropic" needs to be surrounded with double quotation marks.
- 5. The PHRASE operator is applied to a phrase, where a phrase is defined as two or more words separated by spaces.
- 6. Queries are case-insensitive when the query terms are entered entirely in lowercase or entirely in uppercase characters. Queries are case-sensitive when the query terms are entered in mixed-case characters. To force case-sensitive searches for words and phrases, users can use the CASE modifier in queries.
- 7. Special meaning is assigned to the following words in a query expression: AND, OR, NOT. These words are interpreted as Verity query language operators, unless they are enclosed in double quotation marks.
- For example, to search for the phrase "recycle and reuse," ensuring that the word "and" is not interpreted as an operator, the following query can be used:
- recycle "and" reuse
- NOTE: Special characters such as "&" and "|" must also be enclosed in double quotation marks.
- 8. The Verity query language can be used to perform zone and field searches. The zones that are available for searching depend on the type of documents in the collection.
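The interpretation rules listed above (items 1 through 7) can be summarized in a short Python sketch. It is a simplified model for illustration, not Verity's implementation. The Term class, the function names, and the kind labels ("topic", "stem", "literal", "phrase", "operator") are all hypothetical, and two points are assumptions rather than documented behavior: topic names are compared in uppercase, and a run of adjacent words that contains a quoted word is treated as one phrase.

```python
import re
from dataclasses import dataclass

OPERATORS = {"AND", "OR", "NOT"}  # reserved words from item 7


@dataclass(frozen=True)
class Term:
    kind: str                    # "topic", "stem", "literal", "phrase", "operator"
    text: str
    case_sensitive: bool = False


def is_mixed_case(text):
    # Item 6: only mixed-case input makes a query case-sensitive.
    return any(c.islower() for c in text) and any(c.isupper() for c in text)


def tokenize(query):
    # Keep double-quoted spans together; split everything else on whitespace.
    return re.findall(r'"[^"]*"|\S+', query)


def classify_run(run, topics):
    """Classify a run of adjacent tokens that contains no unquoted operator."""
    words = [tok.strip('"') for tok in run]
    text = " ".join(words)
    quoted = any(tok.startswith('"') for tok in run)
    # Item 3: spaces become hyphens when matching topic names.
    # Assumption: topic names are compared in uppercase.
    topic_name = "-".join(words).upper()
    if not quoted and topic_name in topics:
        return Term("topic", topic_name)
    if len(words) > 1:
        return Term("phrase", text, is_mixed_case(text))   # item 5
    if quoted:
        return Term("literal", text, is_mixed_case(text))  # item 4
    return Term("stem", text, is_mixed_case(text))         # items 1 and 2


def interpret(query, topics=frozenset()):
    terms, run = [], []
    for tok in tokenize(query):
        if tok.upper() in OPERATORS:  # quoted words never match (item 7)
            if run:
                terms.append(classify_run(run, topics))
                run = []
            terms.append(Term("operator", tok.upper()))
        else:
            run.append(tok)
    if run:
        terms.append(classify_run(run, topics))
    return terms


for q in ['meet', '"tropic"', 'recycle "and" reuse', 'recycle and reuse']:
    print(q, "->", interpret(q))
```

In this model, recycle "and" reuse is classified as a single phrase, while the unquoted recycle and reuse splits into two stemmed words joined by the AND operator. Zone and field searches (item 8) are not modeled.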
Copyright © 2001, Verity, Inc. All rights reserved.