E.1 Configuration of Regular Expressions

The following directives control the special characters that appear in regular expressions:

Directive      Use                                       Default Value

.REGEX_CHAR    Sets the escape character used to         \ (backslash)
               indicate special sequences

.REGEX_WILD    Sets the wildcard character that          . (period)
               matches any single character
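To make the effect of these two settings concrete, here is a minimal Python sketch that converts a pattern written with a chosen escape and wildcard character into an equivalent Python regular expression. It is only an illustration: the function name to_python_regex and the alternative characters % and ? are hypothetical, and the sketch handles only literal characters, the wildcard, and escaped characters (sets, closures, and tagged forms are left out).

```python
import re

def to_python_regex(pattern: str, esc: str = "\\", wild: str = ".") -> str:
    """Translate literals, the wildcard, and escapes into Python regex syntax.

    `esc` and `wild` stand in for the characters chosen with .REGEX_CHAR and
    .REGEX_WILD. This is a sketch, not the engine the directives configure.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == esc and i + 1 < len(pattern):
            # An escaped character is taken literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == wild:
            # The wildcard matches any single character
            # (Python's "." excludes newline by default).
            out.append(".")
            i += 1
        else:
            # Everything else is an ordinary character in this sketch.
            out.append(re.escape(ch))
            i += 1
    return "".join(out)

# Default characters: "." is the wildcard and "\" escapes it.
assert re.fullmatch(to_python_regex("a.c"), "abc")
assert not re.fullmatch(to_python_regex("a\\.c"), "abc")

# Hypothetical alternatives: "?" as the wildcard, "%" as the escape.
assert re.fullmatch(to_python_regex("a?c", esc="%", wild="?"), "abc")
assert not re.fullmatch(to_python_regex("a.c", esc="%", wild="?"), "abc")
```

Once the wildcard is changed, the period becomes an ordinary character, which is why the last pattern no longer matches abc.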

Regular Expression Components

Table 29 lists the forms, or components, of a regular expression.

Table 29 Regular Expression Components


Label  Form          Form Description

[1]    character     A normal character matches itself. Special characters:
                     wild \ [ ] * + ^ $

[2]    wild          The wildcard character, wild, matches any character.

[3]    \             The escape character makes special characters literal:
                     \wild is literal wild, \\ is literal \, \[ is literal [,
                     and so on. The only exceptions are that \( and \) are
                     special. See [7] below.

[4]    [ set ]       Matches one character in set. If the first character in
                     set is ^, this form matches characters not in set. A
                     range start-end means characters from start to end. The
                     characters ] and - aren't special if they appear as the
                     first characters in set. \t matches a tab character.
                     For example:

                     Set             Matches
                     [a-zA-Z@]       Lowercase and uppercase alphabetic, or @
                     [^]-]           Neither ] nor -
                     [^A-Z]          Not uppercase alphabetic
                     [<SPACE> \t]    A space or tab (that is, white space)

[5]    form*         Any regular expression form labeled [1] to [4] followed
                     by the closure character * matches zero or more of the
                     form.

[6]    form+         + is like *, except it matches one or more of the form.

[7]    \( form \)    A regular expression in the form [1] to [10], enclosed
                     as \( form \) matches what form matches. The substring
                     matched by form can be referenced with a tag (see
                     below).

[8]    \1 ... \9     Matches a previously tagged regular expression (see
                     [7]).

[9]    form1form2    A composite regular expression form1form2, where form1
                     and form2 are in the form [1] to [9], matches the
                     longest match of form1 followed by a match of form2.

[10]   ^ and $       A regex starting with ^ and/or ending with $ restricts
                     the regex to the beginning of the line and/or the end
                     of line. Elsewhere in the regex, ^ and $ are ordinary
                     characters.
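
As a rough illustration of the forms in Table 29, the sketch below exercises each one with Python's re module. This is an analogy, not the engine described here. It assumes the default escape (\) and wildcard (.) characters. Python writes the tagged form \( form \) as ( form ), so the patterns for [7] and [8] are spelled differently. Python also treats ^ and $ as anchors wherever they appear, so the examples use them only at the start and end of a pattern. The names found and EXAMPLES are introduced for this illustration only.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if the Python regex `pattern` matches somewhere in `text`."""
    return re.search(pattern, text) is not None

# (table label, Python pattern, sample text, expected result)
EXAMPLES = [
    ("[1] character", r"abc",           "xxabcxx", True),
    ("[2] wild",      r"a.c",           "a-c",     True),
    ("[3] escape",    r"a\.c",          "a.c",     True),
    ("[3] escape",    r"a\.c",          "abc",     False),
    ("[4] set",       r"[a-zA-Z@]",     "@",       True),
    ("[4] set",       r"[^]-]",         "]",       False),
    ("[4] set",       r"[^]-]",         "x",       True),
    ("[4] set",       r"[^A-Z]",        "Q",       False),
    ("[4] set",       r"[ \t]",         "\t",      True),
    ("[5] form*",     r"^ab*$",         "a",       True),
    ("[6] form+",     r"^ab+$",         "a",       False),
    ("[6] form+",     r"^ab+$",         "abbb",    True),
    # Python spells the tagged form \( form \) as ( form ).
    ("[7][8] tag",    r"^([a-z]+)-\1$", "abc-abc", True),
    ("[7][8] tag",    r"^([a-z]+)-\1$", "abc-xyz", False),
    ("[10] ^",        r"^make",         "remake",  False),
    ("[10] $",        r"\.c$",          "main.c",  True),
]

for label, pattern, text, expected in EXAMPLES:
    assert found(pattern, text) == expected, label

# [9] composite: the first form takes the longest match it can,
# and the second form matches what follows.
m = re.match(r"(a*)(a*b)", "aaab")
assert m is not None and m.group(1) == "aaa" and m.group(2) == "b"
```

The table-driven layout keeps each row next to its label, so a pattern can be checked against the corresponding entry in Table 29.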


Referencing the Matched Expression

Once a regular expression has matched, you can reference the matched part: