General Description

The parsing instructions are ARG, PARSE, and PULL (see section ARG, section PARSE, and section PULL).

The data to parse is a source string. Parsing splits up the data in a source string and assigns pieces of it into the variables named in a template. A template is a model specifying how to split the source string. The simplest kind of template consists of only a list of variable names. Here is an example:

variable1 variable2 variable3

This kind of template parses the source string into blank-delimited words. More complicated templates contain patterns in addition to variable names.

String patterns
Match characters in the source string to specify where to split it. (See section Templates Containing String Patterns for details.)
Positional patterns
Indicate the character positions at which to split the source string. (See section Templates Containing Positional (Numeric) Patterns for details.)

Parsing is essentially a two-step process.

  1. Parse the source string into appropriate substrings using patterns.
  2. Parse each substring into words.
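The two steps can be seen in a single instruction. The following is a minimal sketch; the source string and the variable names ln, fn, and mi are made up for illustration.

```rexx
/* Two-step parsing: split on a pattern, then parse into words   */
parse value 'Smith, John Quincy' with ln ', ' fn mi
/* Step 1: ', ' splits the source into 'Smith' and 'John Quincy' */
/* Step 2: each substring is parsed into words                   */
/* Assigns: ln='Smith'; fn='John'; mi='Quincy'                   */
```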

Simple Templates for Parsing into Words

Here is a parsing instruction:

parse value 'time and tide' with var1 var2 var3

The template in this instruction is: var1 var2 var3. The data to parse, the source string time and tide, is between the keywords PARSE VALUE and the keyword WITH. Parsing divides the source string into blank-delimited words and assigns them to the variables named in the template as follows:

var1='time'
var2='and'
var3='tide'

In this example, the source string to parse is a literal string, time and tide. In the next example, the source string is a variable.

/* PARSE VALUE using a variable as the source string to parse    */
string='time and tide'
parse value string with var1 var2 var3           /* same results */

(PARSE VALUE does not convert lowercase a-z in the source string to uppercase A-Z. If you want to convert characters to uppercase, use PARSE UPPER VALUE. See section Using UPPER for a summary of the effect of parsing instructions on case.)

All of the parsing instructions assign the parts of a source string into the variables named in a template. There are various parsing instructions because of differences in the nature or origin of source strings. (A summary of all the parsing instructions is in section Parsing Instructions Summary.)

The PARSE VAR instruction is similar to PARSE VALUE except that the source string to parse is always a variable. In PARSE VAR, the name of the variable containing the source string follows the keywords PARSE VAR. In the next example, the variable stars contains the source string. The template is star1 star2 star3.

/* PARSE VAR example                                             */
stars='Sirius Polaris Rigil'
parse var stars star1 star2 star3             /* star1='Sirius'  */
                                              /* star2='Polaris' */
                                              /* star3='Rigil'   */

All variables in a template receive new values. If there are more variables in the template than words in the source string, the leftover variables receive null (empty) values. This is true for all parsing: for parsing into words with simple templates and for parsing with templates containing patterns. Here is an example using parsing into words.

/* More variables in template than (words in) the source string  */
satellite='moon'
parse var satellite Earth Mercury               /* Earth='moon'  */
                                                /* Mercury=''    */

If there are more words in the source string than variables in the template, the last variable in the template receives all leftover data. Here is an example:

/* More (words in the) source string than variables in template  */
satellites='moon Io Europa Callisto...'
parse var satellites Earth Jupiter              /* Earth='moon'  */
                               /* Jupiter='Io Europa Callisto...'*/

Parsing into words removes leading and trailing blanks from each word before it is assigned to a variable. The exception to this is the word or group of words assigned to the last variable. The last variable in a template receives leftover data, preserving extra leading and trailing blanks. Here is an example:

/* Preserving extra blanks                                       */
solar5='Mercury Venus  Earth   Mars     Jupiter  '
parse var solar5 var1 var2 var3 var4
/* var1  ='Mercury'                                              */
/* var2  ='Venus'                                                */
/* var3  ='Earth'                                                */
/* var4  ='  Mars     Jupiter  '                                 */

In the source string, Earth has two leading blanks. Parsing removes both of them (the word-separator blank and the extra blank) before assigning var3='Earth'. Mars has three leading blanks. Parsing removes one word-separator blank and keeps the other two leading blanks. It also keeps all five blanks between Mars and Jupiter and both trailing blanks after Jupiter.

Parsing removes no blanks if the template contains only one variable. For example:

parse value '   Pluto   ' with var1        /* var1='   Pluto   '*/

The Period as a Placeholder

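A period in a template is a placeholder. It is used instead of a variable name, but it receives no data. It is useful for skipping words in the source string that you do not need, including unwanted data at the end of the string.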

The period in the first example is a placeholder. Be sure to separate adjacent periods with spaces; otherwise, an error results.

/* Period as a placeholder                                       */
stars='Arcturus Betelgeuse Sirius Rigil'
parse var stars . . brightest .            /* brightest='Sirius' */

/* Alternative to period as placeholder                          */
stars='Arcturus Betelgeuse Sirius Rigil'
parse var stars drop junk brightest rest   /* brightest='Sirius' */

A placeholder saves the overhead of unneeded variables.
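A trailing period is a common way to discard leftover data. The sketch below reuses the star names from the examples above; the variable names first and second are illustrative. Without the final period, second would receive all the remaining words.

```rexx
/* Trailing period discards the rest of the source string        */
stars='Arcturus Betelgeuse Sirius Rigil'
parse var stars first second .
/* Assigns: first='Arcturus'; second='Betelgeuse'                */
```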

Templates Containing String Patterns

A string pattern matches characters in the source string to indicate where to split it. A string pattern can be a:

Literal string pattern
One or more characters within quotation marks.
Variable string pattern
A variable within parentheses with no plus (+) or minus (-) or equal sign (=) before the left parenthesis. (See section Parsing with Variable Patterns for details.)

Here are two templates: a simple template and a template containing a literal string pattern:

var1 var2          /* simple template                            */
var1 ', ' var2     /* template with literal string pattern       */

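The literal string pattern is: ', '. This template puts the characters from the start of the source string up to (but not including) the first match of the pattern into var1, and the characters after the match through the end of the string into var2.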

A template with a string pattern can omit some of the data in a source string when assigning data into variables. The next two examples contrast simple templates with templates containing literal string patterns.

/* Simple template                                               */
name='Smith, John'
parse var name ln fn                     /* Assigns: ln='Smith,' */
                                         /*          fn='John'   */

Notice that the comma remains (the variable ln contains 'Smith,'). In the next example the template is ln ', ' fn. This removes the comma.

/* Template with literal string pattern                          */
name='Smith, John'
parse var name ln ', ' fn                /* Assigns: ln='Smith'  */
                                         /*          fn='John'   */

First, the language processor scans the source string for ', '. It splits the source string at that point. The variable ln receives data starting with the first character of the source string and ending with the last character before the match. The variable fn receives data starting with the first character after the match and ending with the end of string.

A template with a string pattern omits data in the source string that matches the pattern. (There is a special case, described in section Combining String and Positional Patterns: A Special Case, in which a template with a string pattern does not omit matching data in the source string.) We used the pattern ', ' (with a blank) instead of ',' (no blank) because, without the blank in the pattern, the variable fn receives ' John' (including a blank).

If the source string does not contain a match for a string pattern, then any variables preceding the unmatched string pattern get all the data in question. Any variables after that pattern receive the null string.

A null string is never found. It always matches the end of the source string.
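The following sketch illustrates both rules. The source string has no comma, so the pattern ', ' is not found; the null pattern in the second instruction behaves the same way. The variable names whole and rest are illustrative.

```rexx
/* String pattern with no match in the source string             */
name='Smith John'
parse var name ln ', ' fn      /* Assigns: ln='Smith John'       */
                               /*          fn=''                 */

/* Null string pattern matches the end of the source string      */
parse var name whole '' rest   /* Assigns: whole='Smith John'    */
                               /*          rest=''               */
```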

Templates Containing Positional (Numeric) Patterns

A positional pattern is a number that identifies the character position at which to split data in the source string. The number must be a whole number.

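An absolute positional pattern is a number with no plus (+) or minus (-) sign preceding it, or with an equal sign (=) preceding it. (It can also be a variable within parentheses, with an equal sign preceding the left parenthesis; for details see section Parsing with Variable Patterns.)

The number specifies the absolute character position at which to split the source string.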

Here is a template with absolute positional patterns:

variable1 11 variable2 21 variable3

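The numbers 11 and 21 are absolute positional patterns. The number 11 refers to the 11th position in the source string, 21 to the 21st position. This template puts characters 1 through 10 of the source string into variable1, characters 11 through 20 into variable2, and characters 21 through the end of the string into variable3.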

Positional patterns are probably most useful for working with a file of records, such as:

          character positions:
          1         11         21                  40
         *----------*----------*--------------------*end of
 FIELDS: |LASTNAME  |FIRST     |PSEUDONYM           |record
         *----------*----------*--------------------*

The following example uses this record structure.

/* Parsing with absolute positional patterns in template         */
record.1='Clemens   Samuel    Mark Twain          '
record.2='Evans     Mary Ann  George Eliot        '
record.3='Munro     H.H.      Saki                '
do n=1 to 3
  parse var record.n lastname 11 firstname 21 pseudonym
  If lastname='Evans' & firstname='Mary Ann' then say 'By George!'
end                         /* Says 'By George!' after record 2  */

The source string is first split at character position 11 and at position 21. The language processor assigns characters 1 to 10 into lastname, characters 11 to 20 into firstname, and characters 21 to 40 into pseudonym.

The template could have been:

1 lastname 11 firstname 21 pseudonym

instead of

  lastname 11 firstname 21 pseudonym

Specifying the 1 is optional.

Optionally, you can put an equal sign before a number in a template. An equal sign is the same as no sign before a number in a template. The number refers to a particular character position in the source string. These two templates work the same:

lastname  11 first  21 pseudonym

lastname =11 first =21 pseudonym

A relative positional pattern is a number with a plus (+) or minus (-) sign preceding it. (It can also be a variable within parentheses, with a plus (+) or minus (-) sign preceding the left parenthesis; for details see section Parsing with Variable Patterns.)

The number specifies the relative character position at which to split the source string. The plus or minus indicates movement right or left, respectively, from the start of the string (for the first pattern) or from the position of the last match. The position of the last match is the first character of the last match. Here is the same example as for absolute positional patterns done with relative positional patterns:

/* Parsing with relative positional patterns in template         */
record.1='Clemens   Samuel    Mark Twain          '
record.2='Evans     Mary Ann  George Eliot        '
record.3='Munro     H.H.      Saki                '
do n=1 to 3
  parse var record.n lastname +10 firstname + 10 pseudonym
  If lastname='Evans' & firstname='Mary Ann' then say 'By George!'
end                                             /* same results  */

Blanks between the sign and the number are insignificant. Therefore, +10 and + 10 have the same meaning. Note that +0 is a valid relative positional pattern.

Absolute and relative positional patterns are interchangeable, except in the special case when a string pattern precedes a variable name and a positional pattern follows the variable name (see section Combining String and Positional Patterns: A Special Case). The templates from the examples of absolute and relative positional patterns give the same results.

|      |   |lastname  11|   |firstname 21  | | pseudonym |
|      |   |lastname +10|   |firstname + 10| | pseudonym |
*--*---*   *------*-----*   *------*-------* *-----*-----*
   *              *                *               *
(Implied   Put characters    Put characters   Put characters
starting   1 through 10      11 through 20    21 through
point is   in lastname.      in firstname.    end of string
position   (Non-inclusive    (Non-inclusive   in pseudonym.
1.)        stopping point    stopping point
           is 11 (1+10).)    is 21 (11+10).)

Only with positional patterns can a matching operation back up to an earlier position in the source string. Here is an example using absolute positional patterns:

/* Backing up to an earlier position (with absolute positional)  */
string='astronomers'
parse var string 2 var1 4 1 var2 2 4 var3 5 11 var4
say string 'study' var1||var2||var3||var4
/* Displays: "astronomers study stars"                           */

The absolute positional pattern 1 backs up to the first character in the source string.

With relative positional patterns, a number preceded by a minus sign backs up to an earlier position. Here is the same example using relative positional patterns:

/* Backing up to an earlier position (with relative positional)  */
string='astronomers'
parse var string 2 var1 +2 -3 var2 +1 +2 var3 +1 +6 var4
say string 'study' var1||var2||var3||var4      /* same results   */

In the previous example, the relative positional pattern -3 backs up to the first character in the source string.

The templates in the last two examples are equivalent.

|  2  |   |var1  4 |  |  1   | |var2  2|  | 4 var3  5|  |11 var4 |
|  2  |   |var1 +2 |  | -3   | |var2 +1|  |+2 var3 +1|  |+6 var4 |
*--*--*   *---*----*  *--*---* *---*---*  *----*-----*  *---*----*
   *          *          *         *           *            *

Start     Non-        Go to 1. Non-        Go to 4       Go to 11
at 2.     inclusive   (4-3=1)  inclusive   (2+2=4).      (5+6=11).
          stopping             stopping    Non-inclusive
          point is 4           point is    stopping point
          (2+2=4).             2 (1+1=2).  is 5 (4+1=5).

You can use templates with positional patterns to make multiple assignments:

/* Making multiple assignments                                   */
books='Silas Marner, Felix Holt, Daniel Deronda, Middlemarch'
parse var books 1 Eliot 1 Evans
/* Assigns the (entire) value of books to Eliot and to Evans.    */

Combining Patterns and Parsing Into Words

What happens when a template contains patterns that divide the source string into sections containing multiple words? String and positional patterns divide the source string into substrings. The language processor then applies a section of the template to each substring, following the rules for parsing into words.

/* Combining string pattern and parsing into words               */
name='    John      Q.   Public'
parse var name fn init '.' ln        /* Assigns: fn='John'       */
                                     /*          init='     Q'   */
                                     /*          ln='   Public'  */

The pattern divides the template into two sections: fn init and ln.

The matching pattern splits the source string into two substrings: '    John      Q' and '   Public'.

The language processor parses these substrings into words based on the appropriate template section.

John has leading blanks. All are removed because parsing into words removes leading and trailing blanks except from the last variable.

Q has six leading blanks. Parsing removes one word-separator blank and keeps the rest because init is the last variable in that section of the template.

For the substring '   Public', parsing assigns the entire string into ln without removing any blanks. This is because ln is the only variable in this section of the template. (For details about treatment of blanks, see section Simple Templates for Parsing into Words.)

/* Combining positional patterns with parsing into words         */
string='R E X X'
parse var string var1 var2 4 var3 6 var4   /* Assigns: var1='R'  */
                                           /*          var2='E'  */
                                           /*          var3=' X' */
                                           /*          var4=' X' */

The patterns divide the template into three sections: var1 var2, var3, and var4.

The matching patterns split the source string into three substrings that are individually parsed into words: 'R E', ' X', and ' X'.

The variable var1 receives 'R'; var2 receives 'E'. Both var3 and var4 receive ' X' (with a blank before the X) because each is the only variable in its section of the template. (For details on treatment of blanks, see section Simple Templates for Parsing into Words.)

Parsing with Variable Patterns

You may want to specify a pattern by using the value of a variable instead of a fixed string or number. You do this by placing the name of the variable in parentheses. This is a variable reference. Blanks are not necessary inside or outside the parentheses, but you can add them if you wish.

The template in the next parsing instruction contains the following literal string pattern '. '.

parse var name fn  init '. ' ln

Here is how to specify that pattern as a variable string pattern:

strngptrn='. '
parse var name fn init (strngptrn) ln

If no equal, plus, or minus sign precedes the parenthesis that is before the variable name, the value of the variable is then treated as a string pattern. The variable can be one that has been set earlier in the same template.

Example:

/* Using a variable as a string pattern                          */
/*  The variable (delim) is set in the same template             */
SAY "Enter a date (mm/dd/yy format). =====> " /* assume 11/15/90 */
pull date
parse var date month 3 delim +1 day +2 (delim) year
            /* Sets: month='11'; delim='/'; day='15'; year='90'  */

If an equal, a plus, or a minus sign precedes the left parenthesis, then the value of the variable is treated as an absolute or relative positional pattern. The value of the variable must be a positive whole number or zero.

The variable can be one that has been set earlier in the same template. In the following example, the first two fields specify the starting character positions of the last two fields.

Example:

/* Using a variable as a positional pattern                      */
dataline = '12 26 .....Samuel ClemensMark Twain'
parse var dataline pos1 pos2 6 =(pos1) realname =(pos2) pseudonym
/* Assigns: realname='Samuel Clemens'; pseudonym='Mark Twain'    */

Why is the positional pattern 6 needed in the template? Remember that word parsing occurs after the language processor divides the source string into substrings using patterns. Therefore, the positional pattern =(pos1) cannot be correctly interpreted as =12 until after the language processor has split the string at column 6 and assigned the blank-delimited words 12 and 26 to pos1 and pos2, respectively.
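A variable can also supply a relative positional pattern when a plus or minus sign precedes the left parenthesis. The following is a minimal sketch; the variable name width and the source string are assumptions chosen for illustration.

```rexx
/* Using a variable as a relative positional pattern             */
width=5
parse value 'HelloWorld' with first +(width) rest
/* Assigns: first='Hello'; rest='World'                          */
```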

Using UPPER

Specifying UPPER on any of the PARSE instructions converts characters to uppercase (lowercase a-z to uppercase A-Z) before parsing. The following table summarizes the effect of the parsing instructions on case.

Converts alphabetic characters      Maintains alphabetic characters
to uppercase before parsing         in case entered
------------------------------      -------------------------------
ARG, PARSE UPPER ARG                PARSE ARG
PARSE UPPER EXTERNAL                PARSE EXTERNAL
PARSE UPPER NUMERIC                 PARSE NUMERIC
PULL, PARSE UPPER PULL              PARSE PULL
PARSE UPPER SOURCE                  PARSE SOURCE
PARSE UPPER VALUE                   PARSE VALUE
PARSE UPPER VAR                     PARSE VAR
PARSE UPPER VERSION                 PARSE VERSION

The ARG instruction is simply a short form of PARSE UPPER ARG. The PULL instruction is simply a short form of PARSE UPPER PULL. If you do not desire uppercase translation, use PARSE ARG (instead of ARG or PARSE UPPER ARG) and use PARSE PULL (instead of PULL or PARSE UPPER PULL).
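The difference can be sketched with PULL and PARSE PULL reading the same queued line. This assumes an environment that provides the external data queue (as the PARSE PULL example later in this chapter does); the data and variable names are illustrative.

```rexx
/* Same queued line, with and without uppercase translation      */
push 'Mark Twain'
pull first last                /* first='MARK'; last='TWAIN'     */
push 'Mark Twain'
parse pull first last          /* first='Mark'; last='Twain'     */
```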

Parsing Instructions Summary

Remember: All parsing instructions assign parts of the source string into the variables named in the template. The following table summarizes where the source string comes from.

Instruction         Where the source string comes from
----------------    --------------------------------------------------
ARG, PARSE ARG      Arguments you list when you call the program or
                    arguments in the call to a subroutine or function.
PARSE EXTERNAL      Next line from terminal input buffer
PARSE NUMERIC       Numeric control information (from NUMERIC
                    instruction).
PULL, PARSE PULL    The string at the head of the external data queue.
                    (If queue empty, uses default input, typically the
                    terminal.)
PARSE SOURCE        System-supplied string giving information about
                    the executing program.
PARSE VALUE         Expression between the keyword VALUE and the
                    keyword WITH in the instruction.
PARSE VAR name      Parses the value of name.
PARSE VERSION       System-supplied string specifying the language,
                    language level, and (three-word) date.
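Because PARSE VALUE takes an expression, the expression is evaluated first and its result becomes the source string. A minimal sketch, with made-up data and variable names:

```rexx
/* PARSE VALUE evaluates the expression before parsing           */
parse value 3+4 'apples' with count fruit
/* The expression yields '7 apples'                              */
/* Assigns: count='7'; fruit='apples'                            */
```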

Parsing Instructions Examples

All examples in this section parse source strings into words.

ARG

/* ARG with source string named in REXX program invocation       */
/*  Program name is PALETTE.  Specify 2 primary colors (yellow,  */
/*   red, blue) on call.   Assume call is: palette red blue      */
arg var1 var2                /* Assigns: var1='RED'; var2='BLUE' */
If var1<>'RED' & var1<>'YELLOW' & var1<>'BLUE' then signal err
If var2<>'RED' & var2<>'YELLOW' & var2<>'BLUE' then signal err
total=length(var1)+length(var2)
SELECT;
  When total=7 then new='purple'
  When total=9 then new='orange'
  When total=10 then new='green'
  Otherwise new=var1                       /* entered duplicates */
END
Say new; exit                              /* Displays: "purple" */

Err:
say 'Input error--color is not "red" or "blue" or "yellow"'; exit

ARG converts alphabetic characters to uppercase before parsing. An example of ARG with the arguments in the CALL to a subroutine is in section Parsing Multiple Strings.

PARSE ARG works the same as ARG except that PARSE ARG does not convert alphabetic characters to uppercase before parsing.

PARSE EXTERNAL

Say "Enter Yes or No =====> "
parse upper external answer 2 .
If answer='Y'
  then say "You said 'Yes'!"
  else say "You said 'No'!"

PARSE NUMERIC

parse numeric digits fuzz form
say digits fuzz form           /* Displays: '9 0 SCIENTIFIC'     */
                               /* (if defaults are in effect)    */

PARSE PULL

PUSH '80 7'                /* Puts data on queue                 */
parse pull fourscore seven /* Assigns: fourscore='80'; seven='7' */
SAY fourscore+seven        /* Displays: "87"                     */

PARSE SOURCE

parse source sysname .
Say sysname                          /* Displays:       "CICS"   */

A PARSE VALUE example is in section Simple Templates for Parsing into Words.

PARSE VAR examples are throughout the chapter, starting in section Simple Templates for Parsing into Words.

PARSE VERSION

parse version . level .
say level                                    /* Displays: "3.48" */

PULL works the same as PARSE PULL except that PULL converts alphabetic characters to uppercase before parsing.