Example: parsing an XML document one segment at a time

This example shows how to parse an XML document one segment at a time. The program must be compiled with the XMLPARSE(XMLSS) compiler option.

The example shows the XML content of a file, the program that reads and submits XML text to the parser, and the sequence of events that results from parsing the input records.

Content of infile

The file infile, shown below, contains the XML document that is parsed one segment at a time.


<?xml version='1.0'?>
<Tagline>
COBOL is the language of the future!
</Tagline>

Program PARSESEG

Program PARSESEG reads a segment (a record) of the XML document from file infile, then passes the record to the parser using the XML PARSE statement. The parser processes the XML text and transfers control to the processing procedure for each XML event. The processing procedure handles each event and returns to the parser.

At the end of the segment, the parser sets XML-EVENT to END-OF-INPUT, sets XML-CODE to zero, and transfers control to the processing procedure. The processing procedure reads the next XML record into the parse data item, sets XML-CODE to one, and returns to the parser.

The exchange between the processing procedure and the parser continues until the READ statement returns the end-of-file status code (10). The processing procedure then returns to the parser with XML-CODE still set to zero, which indicates that there are no more segments to process.


  Identification division.                                                 
  Program-id. PARSESEG.                                                    
  Environment division.                                                    
  Input-output section.                                                    
  File-control.                                                            
      Select Input-XML                                                     
       Assign to infile                                                   
       File status is Input-XML-status.                                   
  Data division.                                                           
  File section.                                                            
  FD Input-XML                                                             
      Record is varying from 1 to 255 depending on Rec-length              
      Recording mode V.                                                    
  1 fdrec.                                                                 
    2 pic X occurs 1 to 255 depending on Rec-length .                      
  Working-storage section.                                                 
  1 Event-number comp pic 99.                                              
  1 Rec-length comp-5 pic 9(4).                                            
  1 Input-XML-status pic 99.                                               
  Procedure division.                                                      
      Open input Input-XML                                                 
      If Input-XML-status not = 0                                          
        Display 'Open failed, file status: '  Input-XML-status             
        Goback                                                             
      End-if                                                               
      Read Input-XML                                                       
      If Input-XML-status not = 0                                          
        Display 'Read failed, file status: '  Input-XML-status             
        Goback                                                             
      End-if                                                               
      Move 0 to Event-number                                               
      Display 'Starting with: ' fdrec                                     
      Display 'Event number and name    Content of XML-text'               
      XML parse fdrec processing procedure Handle-parse-events             
      Close Input-XML                                                      
      Goback                                                               
      .                                                                    
  Handle-parse-events.                                                     
      Add 1 to Event-number                                                
      Display '  ' Event-number ': ' XML-event '{' XML-text '}'            
      Evaluate XML-event                                                   
        When 'END-OF-INPUT'                                                
          Read Input-XML                                                   
          Evaluate Input-XML-status                                        
            When 0                                                         
              Move 1 to XML-code                                           
              Display 'Continuing with: ' fdrec                            
            When 10                                                        
              Display 'At EOF; no more input.'                             
            When other                                                     
              Display 'Read failed, file status: ' Input-XML-status
              Goback                                                       
          End-evaluate                                                     
        When other                                                         
          Continue                                                         
      End-evaluate                                                         
      .
  End program PARSESEG.                                                    
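
PARSESEG does not test the outcome of the XML PARSE statement. As a minimal sketch of one way to do so, the following stand-alone program codes the ON EXCEPTION and NOT ON EXCEPTION phrases and displays XML-CODE if parsing ends abnormally. The program name PARSECHK, the data item Xml-doc, and the paragraph Handle-events are hypothetical names chosen for illustration. To keep the sketch short, the document is parsed from working storage as a single segment rather than read from infile.

```cobol
 * Hypothetical stand-alone sketch; PARSECHK, Xml-doc, and
 * Handle-events are illustrative names, not part of PARSESEG.
  Identification division.
  Program-id. PARSECHK.
  Data division.
  Working-storage section.
  1 Xml-doc pic X(24) value '<Tagline>COBOL</Tagline>'.
  Procedure division.
      XML parse Xml-doc processing procedure Handle-events
        On exception
          Display 'XML parse ended with XML-CODE: ' XML-code
        Not on exception
          Display 'XML parse completed normally.'
      End-XML
      Goback
      .
  Handle-events.
 * XML-CODE is left unchanged, so the END-OF-INPUT event
 * returns zero (no further segments).
      Display XML-event '{' XML-text '}'
      .
  End program PARSECHK.
```

The same two phrases could be added to the XML PARSE statement in PARSESEG, so that the Close and Goback statements run after the outcome has been reported.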

Results from parsing

To show parsing results, the program displays each record of input, followed by the sequence of XML events and any associated text fragments in XML-TEXT. The content of XML-TEXT is displayed in braces ({}); empty braces signify that XML-TEXT is empty.

Notice the extra zero-length CONTENT-CHARACTERS XML event at event number 08. (Such anomalies are typical when supplying XML text piecemeal.)


Starting with:   <?xml version='1.0'?> 
Event number and name      Content of XML-TEXT
  01: START-OF-DOCUMENT      {} 
  02: VERSION-INFORMATION    {1.0} 
  03: END-OF-INPUT           {} 
Continuing with:    <Tagline> 
  04: START-OF-ELEMENT       {Tagline} 
  05: END-OF-INPUT           {} 
Continuing with:    COBOL is the language of the future! 
  06: CONTENT-CHARACTERS     {COBOL is the language of the future!} 
  07: END-OF-INPUT           {} 
Continuing with:    </Tagline> 
  08: CONTENT-CHARACTERS     {} 
  09: END-OF-ELEMENT         {Tagline} 
  10: END-OF-DOCUMENT        {} 
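
If zero-length CONTENT-CHARACTERS events such as event 08 are unwanted, the processing procedure can skip them. The following fragment is a sketch of a possible replacement for the Display statement in Handle-parse-events. It is not part of the original program, and it assumes that the LENGTH intrinsic function applied to XML-TEXT returns zero when an event carries no text.

```cobol
 * Sketch only: assumes FUNCTION LENGTH(XML-text) is zero for
 * an event that carries no text.
      If XML-event = 'CONTENT-CHARACTERS'
          and function length(XML-text) = 0
        Continue
      Else
        Display '  ' Event-number ': ' XML-event '{' XML-text '}'
      End-if
```

Because Event-number is incremented before this test, a skipped event still consumes a number, so the displayed sequence would show a gap at 08.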

For a detailed description of the XML events that were detected, see the related reference about XML-EVENT.

Related references
XMLPARSE (compiler option)  
XML-EVENT (Enterprise COBOL for z/OS® Language Reference)