The following directives control the special characters that appear in regular expressions:
Directive | Use | Default Value |
---|---|---|
.REGEX_CHAR | Sets the escape character used to indicate special sequences | \ (backslash) |
.REGEX_WILD | Sets the wildcard character that matches any single character | . (period) |
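As an illustration only, the effect of redefining these two characters can be sketched in Python. The helper name `translate` and its parameters are hypothetical and not part of the product. It assumes only literal characters, the wildcard, and the escape character need handling, so sets, closures, tags, and anchors are deliberately left out.

```python
import re

def translate(pattern: str, esc: str = "\\", wild: str = ".") -> str:
    """Rewrite a pattern that uses custom escape and wildcard characters
    into Python re syntax. Hypothetical helper: every character other than
    the escape and wildcard characters is treated as a literal."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == esc and i + 1 < len(pattern):
            # The escape character makes the next character literal.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == wild:
            # The wildcard matches any single character.
            out.append(".")
            i += 1
        else:
            # Anything else matches itself.
            out.append(re.escape(ch))
            i += 1
    return "".join(out)

# With % as the escape character and ? as the wildcard:
custom = translate("a?c%?", esc="%", wild="?")
matches_custom = bool(re.fullmatch(custom, "abc?"))
```

With these settings a period in the pattern is an ordinary character, so `translate("a.c", esc="%", wild="?")` would no longer match `abc`.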
Table 29 lists the forms, or components, of a regular expression.
Label | Form | Form Description |
---|---|---|
[1] | character | A normal character matches itself. Special characters: wild \ [ ] * + ^ $ |
[2] | wild | The wildcard character, wild, matches any character. |
[3] | \ | The escape character makes special characters literal: \wild is literal wild, \\ is literal \, \[ is literal [, and so on. The only exceptions are that \( and \) are special. See [7] below. |
[4] | [ set ] | Matches one character in set. If the first character in set is ^, this form matches characters not in set. A range start-end means characters from start to end. The characters ] and - aren't special if they appear as the first characters in set. \t matches a tab character. For example: [a-zA-Z@] matches lowercase and uppercase alphabetic characters, or @; [^]-] matches neither ] nor -; [^A-Z] matches anything that is not uppercase alphabetic; [<SPACE> \t] matches a space or tab (that is, white space). |
[5] | form* | Any regular expression form labeled [1] to [4] followed by the closure character * matches zero or more of the form. |
[6] | form+ | + is like *, except it matches one or more of the form. |
[7] | \( form \) | A regular expression in the form [1] to [10], enclosed as \( form \) matches what form matches. The substring matched by form can be referenced with a tag (see below). |
[8] | \1 ... \9 | Matches a previously tagged regular expression (see [7]). |
[9] | form1form2 | A composite regular expression form1form2, where form1 and form2 are in the form [1] to [9], matches the longest match of form1 followed by a match of form2. |
[10] | ^ and $ | A regex starting with ^ and/or ending with $ restricts the regex to the beginning of the line and/or the end of line. Elsewhere in the regex, ^ and $ are ordinary characters. |
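Several of these forms can be tried out with Python's `re` module, which is used here only as a stand-in and is not the engine this table describes. Its syntax differs in places: Python writes a tagged group as `( )` where the table uses `\( \)`, and it treats `^` and `$` as special everywhere rather than only at the ends of the pattern. The test strings are made up for illustration.

```python
import re

# Each entry records whether a Python pattern matches a sample string.
# The bracketed numbers refer to the labels in the table above.
examples = {
    "set":        bool(re.fullmatch(r"[a-zA-Z@]", "@")),    # [4] letter or @
    "negated":    bool(re.fullmatch(r"[^]-]", "]")),        # [4] ] is excluded
    "whitespace": bool(re.fullmatch(r"[ \t]", "\t")),       # [4] space or tab
    "star":       bool(re.fullmatch(r"ab*", "a")),          # [5] zero or more b
    "plus":       bool(re.fullmatch(r"ab+", "a")),          # [6] needs at least one b
    "backref":    bool(re.fullmatch(r"(ab)\1", "abab")),    # [7] and [8]
    "anchored":   bool(re.search(r"^foo$", "foo")),         # [10] whole line
}

for name, matched in examples.items():
    print(f"{name}: {matched}")
```

The `negated` and `plus` entries are the two that do not match: `]` is excluded from the set `[^]-]`, and `ab+` requires at least one `b`.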
Once a regular expression has matched, you can reference the matched part:
& refers to the entire matched string (but \& is a literal ampersand).
If \( \) is used to delimit a part of the regular expression, the tag \1 refers to the first delimited part of the matched substring. Successive pairs of \( \) are tagged \2, \3, ..., \9.
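A rough sketch of how these references behave, assuming they are used in a replacement string: Python's `re.sub` is used as an analogy only. In Python the entire match is written `\g<0>` rather than `&`, and groups are written `( )` rather than `\( \)`. The sample pattern and input are invented for illustration.

```python
import re

# In this document's notation the pattern would be \([a-z]+\)=\([0-9]+\)
pattern = re.compile(r"([a-z]+)=([0-9]+)")

# Tags \1 and \2 refer to the first and second delimited parts.
swapped = pattern.sub(r"\2=\1", "width=80")

# \g<0> plays the role that & plays in this document: the entire match.
wrapped = pattern.sub(r"[\g<0>]", "width=80")

print(swapped)  # 80=width
print(wrapped)  # [width=80]
```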
Copyright © 2001 by Rational Software Corporation. All rights reserved.