(C) IBM Corp. 1996, 1999

Text Extender: Administration and Programming

DesGetMatches

Purpose

Returns a data stream containing highlighting information for the text document described by a document handle. The highlighting information comprises the text context (at least one paragraph) and the information needed to highlight matches in that context. For the format of the data stream, see Data stream syntax.

Each call to DesGetMatches returns only a portion of the data stream and reports the length of that portion in pMatchInfoLength.

A sequence of calls to DesGetMatches retrieves the content of the entire text document. When the end of the text document is reached, RC_SE_END_OF_INFORMATION is returned.

Syntax

DESRETURN
  DesGetMatches
     (DESBROWSESESSION    BrowseSession,
      DESHANDLE           DocumentHandle,
      DESMATCHINFO        *pMatchInfo,
      DESULONG            *pMatchInfoLength,
      DESMESSAGE          *pErrorMessage);

Function arguments

Table 11. DesGetMatches arguments

Data type          Argument           Use      Description
DESBROWSESESSION   BrowseSession      input    Browse session handle.
DESHANDLE          DocumentHandle     input    Document handle returned by DesOpenDocument.
DESMATCHINFO *     pMatchInfo         output   Pointer to a buffer containing the data stream portion received. DesGetMatches allocates that buffer.
DESULONG *         pMatchInfoLength   output   The length of the data stream portion pointed to by pMatchInfo.
DESMESSAGE *       pErrorMessage      output   Implementation-defined message text. If an error occurs, Text Extender returns an error code and an error message. The application program allocates the buffer of size DES_MAX_MESSAGE_LENGTH. If pErrorMessage is the null pointer, no error message is returned.

Data stream syntax

>>- --05-- --DB2TX_DOC-- --DB2TX_START-- ----------------------->
 
>-----+-------------------------------------------------------------+>
      '- --ll-- --DB2TX_DNAM-- --DB2TX_ATOMIC-- --document_name-- --'
 
      .----------------.
      V                |
>--------| Section |---+-- --05-- --DB2TX_DOC-- --DB2TX_END-- --><
 
Section
 
|--- --05-- --DB2TX_DEL-- --DB2TX_START-- ---------------------->
 
>-----+------------------------------------------------------------+>
      '- --ll-- --DB2TX_SNAM-- --DB2TX_ATOMIC-- --section_name-- --'
 
      .-------------------------------------------.
      |                     .------------------.  |
      V                     V                  |  |
>--------| Text encoding |-----| Paragraph |---+--+------------->
 
>---- --05-- --DB2TX_DEL-- --DB2TX_END-- -----------------------|
 
Text encoding
 
|--- --07-- --DB2TX_CCSID-- --DB2TX_ATOMIC-- --coded_character_set_identifier-- -->
 
>--- --07-- --DB2TX_LANG-- --DB2TX_ATOMIC-- --language_identifier-- -->
 
>---------------------------------------------------------------|
 
Paragraph
 
|--- --05-- --DB2TX_PAR-- --DB2TX_START-- ---------------------->
 
      .-----------------------.
      V                       |
>--------| Paragraph text |---+--------------------------------->
 
>---- --05-- --DB2TX_PAR-- --DB2TX_END-- -----------------------|
 
Paragraph text
 
|---+- --ll-- --DB2TX_TEXT-- --DB2TX_ATOMIC-- --text_unit-- --+->
    '- --ll-- --DB2TX_LINK-- --DB2TX_ATOMIC-- --media_ref-- --'
 
      .-----------------------------------------------.
      V                                               |
>--------+-----------------------------------------+--+--------->
         '- --05-- --DB2TX_NL-- --DB2TX_ATOMIC-- --'
 
      .------------------------------------------------------------------------.
      V                                                                        |
>--------+------------------------------------------------------------------+--+>
         '- --ll-- --DB2TX_MATCH-- --DB2TX_ATOMIC-- --match_information-- --'
 
>---------------------------------------------------------------|
 

 

Each segment in the syntax diagram, such as 05 DB2TX_DOC DB2TX_START, begins with a 2-byte integer length field, shown in the diagram either as an explicit number, such as 05, or as the variable ll. The length of the segment includes the 2-byte length field itself.
Note: The length is in big-endian format.
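
The following C fragment is a minimal sketch of how an application might step through the segments of a data stream portion. It assumes the layout implied by the syntax diagram and the descriptions in this section: a 2-byte big-endian length, a 2-byte item identifier, a 1-byte type identifier, and then any item data. The names segment, parse_segment, and count_segments are illustrative and are not part of Text Extender. The sketch also assumes that the item identifier is big-endian like the length, and that the buffer holds whole segments; neither point is stated in this description.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 2-byte big-endian unsigned value. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Fixed part of every segment: length (2) + item (2) + type (1). */
#define SEGMENT_HEADER_SIZE 5u

/* One decoded segment; the struct and its field names are illustrative. */
typedef struct {
    uint16_t length;              /* total length, including the length field */
    uint16_t item;                /* item identifier, for example DB2TX_DOC   */
    unsigned char type;           /* DB2TX_START, DB2TX_END, or DB2TX_ATOMIC  */
    const unsigned char *payload; /* data that follows the 5-byte header      */
    size_t payload_len;           /* length - SEGMENT_HEADER_SIZE             */
} segment;

/*
 * Decode the segment that starts at buf, where avail bytes are readable.
 * Returns 0 on success, or -1 if the buffer is too short or the length
 * field is inconsistent with the header size or the available bytes.
 */
static int parse_segment(const unsigned char *buf, size_t avail, segment *out)
{
    if (avail < SEGMENT_HEADER_SIZE)
        return -1;

    uint16_t len = read_be16(buf);
    if (len < SEGMENT_HEADER_SIZE || len > avail)
        return -1;

    out->length = len;
    out->item = read_be16(buf + 2);  /* assumption: big-endian, like the length */
    out->type = buf[4];
    out->payload = buf + SEGMENT_HEADER_SIZE;
    out->payload_len = (size_t)len - SEGMENT_HEADER_SIZE;
    return 0;
}

/* Count the segments in a buffer; returns -1 if any segment is malformed. */
static int count_segments(const unsigned char *buf, size_t size)
{
    int n = 0;
    size_t pos = 0;

    while (pos < size) {
        segment s;
        if (parse_segment(buf + pos, size - pos, &s) != 0)
            return -1;
        pos += s.length;  /* the length covers the whole segment */
        n++;
    }
    return n;
}
```

Because the length includes the length field itself, advancing by s.length moves directly to the next segment. A real application would compare s.item and s.type with the symbolic names from DES_EXT.H rather than with numeric literals.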

Each segment includes one of the following 1-byte type identifiers:

DB2TX_START
Indicates the start of a segment, such as a document or a paragraph.

DB2TX_END
Indicates the end of a segment.

DB2TX_ATOMIC
Indicates that the item that follows is atomic, such as a document name or a language identifier.

The data stream items are each two bytes long. They are:

DB2TX_DOC
Indicates the start and end of a document.

DB2TX_DNAM
A document name. If no name is specified, the identifier of the document is used.

DB2TX_DEL
Indicates the start and end of a document element. The only type of document element currently supported is a text section.

DB2TX_SNAM
Specifies the name of a text section. Currently Text Extender supports only one text section and automatically supplies a default name. If you specify a section name, it is ignored.

DB2TX_PAR
Indicates the start and end of a text paragraph within the current section.

DB2TX_TEXT
Specifies one text portion within the current paragraph. Usually, text_unit contains one line of text, and the DB2TX_TEXT item is followed by a DB2TX_NL item; but text lines may also be split into several parts, each part specified in its own DB2TX_TEXT item.

The text uses the CCSID and language associated with the current paragraph.

DB2TX_LINK
Specifies a Text Extender hypermedia reference. It uses the CCSID of the current paragraph.

DB2TX_NL
Indicates the start of a new line in the current paragraph.

DB2TX_MATCH
Contains occurrence information for matches in the current text portion. The information is supplied as a sequence of pairs of binary numbers. The first number in each pair is the offset of a match within the current text portion; the second number is the length, in characters, of that match. The given length can extend past the end of the current text portion. Both offset and length are two-byte values specified in big-endian format.

DB2TX_CCSID
The CCSID for text in subsequent paragraphs until a paragraph is preceded by a new DB2TX_CCSID item. The following CCSIDs are returned:

DB2TX_CCSID_00500
for text in the Latin-1 EBCDIC codepage 500.

DB2TX_CCSID_04946
for text in the Latin-1 ASCII codepage 850.

DB2TX_CCSID_00819
for text in the ASCII codepage 819.

These symbolic names for CCSIDs are defined in the file DES_EXT.H provided with Text Extender. The two-byte binary values are specified in big-endian format.

DB2TX_LANG
The language identifier for text in subsequent paragraphs until a paragraph is preceded by a new DB2TX_LANG item. File DES_EXT.H provided with Text Extender defines symbolic names for all language identifiers supported by Text Extender. The two-byte binary values are specified in big-endian format.
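
The match_information carried by a DB2TX_MATCH item can be decoded along the following lines. This is a sketch only: match_span and decode_matches are illustrative names, not Text Extender definitions. It assumes that offsets count from zero and that the text is in a single-byte code page, so that characters and bytes coincide; neither assumption is confirmed by this description. Because a length can extend past the end of the text portion, the sketch clamps each span to the portion.

```c
#include <stddef.h>
#include <stdint.h>

/* One highlight span inside a text portion; illustrative type. */
typedef struct {
    uint16_t offset;  /* start of the match within the text portion */
    uint16_t length;  /* length of the match, clamped to the portion */
} match_span;

/* Read a 2-byte big-endian unsigned value. */
static uint16_t match_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/*
 * Decode (offset, length) pairs from the data of a DB2TX_MATCH item.
 *   payload, payload_len  bytes that follow the 5-byte segment header
 *   text_len              length of the text portion the matches refer to
 *   out, max_out          caller-supplied array for the decoded spans
 * Returns the number of spans written, or -1 if payload_len is not a
 * whole number of 4-byte pairs.
 */
static int decode_matches(const unsigned char *payload, size_t payload_len,
                          size_t text_len, match_span *out, size_t max_out)
{
    if (payload_len % 4 != 0)
        return -1;

    size_t n = 0;
    for (size_t i = 0; i + 4 <= payload_len && n < max_out; i += 4) {
        uint16_t off = match_be16(payload + i);
        uint16_t len = match_be16(payload + i + 2);

        if (off >= text_len)
            continue;  /* starts outside this portion: nothing to highlight */
        if ((size_t)off + len > text_len)
            len = (uint16_t)(text_len - off);  /* clamp to the portion */

        out[n].offset = off;
        out[n].length = len;
        n++;
    }
    return (int)n;
}
```

Clamping is one possible policy. An application that joins several DB2TX_TEXT items into one line before highlighting might instead keep the unclamped length and apply it to the joined text.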

Usage

DesGetMatches returns RC_SE_END_OF_INFORMATION when the end of the text document is reached.

Return codes

RC_SUCCESS
RC_SE_END_OF_INFORMATION
 
RC_INVALID_PARAMETER
RC_INVALID_SESSION
RC_SE_CAPACITY_LIMIT_EXCEEDED
RC_SE_INCORRECT_HANDLE
RC_SE_IO_PROBLEM
RC_SE_NOT_ENOUGH_MEMORY
RC_SE_REQUEST_IN_PROGRESS
RC_SE_LS_FUNCTION_FAILED
RC_SE_UNEXPECTED_ERROR
 
Warnings: The following return codes indicate that the function has returned a result, but it may not be as expected.
 
RC_SE_DICTIONARY_NOT_FOUND

Restrictions

This function can be called only after you have opened a text document by calling DesOpenDocument.

