Handling splits using the XML-INFORMATION special register

You can use the XML-INFORMATION special register to reassemble character content that the XML parser delivers in pieces across more than one XML event.

To use this feature, compile your program with the XMLPARSE(XMLSS) compiler option in effect.
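One way to put the option in effect is a CBL (PROCESS) statement at the top of the source file. The following is a minimal sketch; the program name xmlsplit is hypothetical, and the option could equally be supplied through the compiler invocation parameters.

```cobol
       CBL XMLPARSE(XMLSS)
       Identification division.
       Program-id. xmlsplit.
```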

Splits in character content might occur at arbitrary points in the XML data stream, even with unsegmented input. The XML-INFORMATION special register simplifies the reassembly of that content. Reassembly might be needed for any attribute value and for any element character content.

When a document is parsed one segment at a time, the length of the parse data item is evaluated for each segment and determines the segment length.

The example, Example: program for processing XML, demonstrates various ways of assigning values obtained from the XML document to program data items for later processing.

The XML data is provided to the parser in 40-byte records, imitating the way an XML document might be acquired from an external source such as a data file. The record boundaries are designed so that all data splits but one are accommodated by the parser. For example, the sample treats as an error a split in any content except the content of the "filling" element.
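When input arrives one record at a time, the processing procedure typically supplies the next record in response to the END-OF-INPUT event. The branch below is a sketch of that step under stated assumptions: xml-file and xml-record are hypothetical names for an input file and the 40-byte parse data item, and the branch is assumed to sit inside an EVALUATE XML-EVENT statement in the processing procedure.

```cobol
      * Sketch only: xml-file and xml-record are hypothetical names.
             When 'END-OF-INPUT'
               Read xml-file into xml-record
                 At end
      * Leave XML-CODE at 0: there is no further input to parse
                   Continue
                 Not at end
      * XML-CODE = 1 tells the parser that another segment is ready
                   Move 1 to XML-CODE
               End-read
```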

In the example, the XML-INFORMATION special register is used only to simplify the reassembly of content for the "filling" element, although it could be used for any attribute value or element character content. An XML-INFORMATION value of 2 indicates that the character data for an ATTRIBUTE-CHARACTERS or CONTENT-CHARACTERS XML event is continued in a subsequent XML event, and should therefore be buffered to accumulate the complete character string. A subsequent XML event of the same type with an XML-INFORMATION value of 1 indicates that XML-TEXT or XML-NTEXT contains the final piece of the character content, and that the complete string can be moved to the appropriate data item.

In the example, the STRING ... WITH POINTER statement appends each piece of content to a buffer and advances the pointer, so that the buffer and pointer together describe the complete character value that is then assigned to the "filling" data item.

                   String xml-text delimited by size into
                       content-buffer with pointer tally
                     On overflow
                       Display 'content buffer ('
                           length of content-buffer
                           ' bytes) is too small'
                       Move -1 to xml-code
                   End-string
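
The fragment above shows only the accumulation step. The following sketch suggests how it might fit into a processing procedure, including the pointer reset and the final assignment. It assumes that current-element, content-buffer, and filling are alphanumeric data items defined in WORKING-STORAGE, and it resets the pointer at each START-OF-ELEMENT event; these details are illustrative and are not taken from the sample program.

```cobol
      * Sketch only: current-element, content-buffer, and filling
      * are assumed to be defined in WORKING-STORAGE.
       xml-handler.
           Evaluate XML-EVENT
             When 'START-OF-ELEMENT'
               Move XML-TEXT to current-element
      * Reset the STRING pointer before any content arrives
               Move 1 to tally
             When 'CONTENT-CHARACTERS'
               If current-element = 'filling'
      * Append this piece; tally advances past the appended text
                 String XML-TEXT delimited by size into
                     content-buffer with pointer tally
                   On overflow
                     Display 'content buffer is too small'
                     Move -1 to XML-CODE
                   Not on overflow
      * 1 = final piece; 2 = more pieces follow in later events
                     If XML-INFORMATION = 1
                       Subtract 1 from tally
                       Move content-buffer(1:tally) to filling
                     End-if
                 End-string
               End-if
           End-evaluate.
```

After the final STRING, the pointer is one position past the last character stored, so subtracting 1 gives the length of the reassembled value for the reference modification.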

related concepts
XML events  
XML-CODE

related references
XMLPARSE (compiler option)  
Example: program for processing XML