The following directives control the special characters that appear in regular expressions:
| Directive | Use | Default Value |
|---|---|---|
| .REGEX_CHAR | Sets the escape character used to indicate special sequences | \ (backslash) |
| .REGEX_WILD | Sets the wildcard character that matches any single character | . (period) |
Table 29 lists the forms, or components, of a regular expression.

| Label | Form | Form Description |
|---|---|---|
| [1] | character | A normal character matches itself. Special characters: wild \ [ ] * + ^ $ |
| [2] | wild | The wildcard character, wild, matches any character. |
| [3] | \ | The escape character makes special characters literal: \wild is literal wild, \\ is literal \, \[ is literal [, and so on. The only exceptions are that \( and \) are special. See [7] below. |
| [4] | [ set ] | Matches one character in set. If the first character in set is ^, this form matches characters not in set. A range start-end means characters from start to end. The characters ] and - aren't special if they appear as the first characters in set. \t matches a tab character. For example: |
| | [a-zA-Z@] | Matches lowercase and uppercase alphabetic, or @ |
| | [^]-] | Matches neither ] nor - |
| | [^A-Z] | Matches not uppercase alphabetic |
| | [<SPACE> \t] | Matches a space or tab (that is, white space) |
| [5] | form* | Any regular expression form labeled [1] to [4] followed by the closure character * matches zero or more of the form. |
| [6] | form+ | + is like *, except it matches one or more of the form. |
| [7] | \( form \) | A regular expression in the form [1] to [10], enclosed as \( form \), matches what form matches. The substring matched by form can be referenced with a tag (see below). |
| [8] | \1 ... \9 | Matches a previously tagged regular expression (see [7]). |
| [9] | form1form2 | A composite regular expression form1form2, where form1 and form2 are in the form [1] to [9], matches the longest match of form1 followed by a match of form2. |
| [10] | ^ and $ | A regex starting with ^ and/or ending with $ restricts the regex to the beginning of the line and/or the end of line. Elsewhere in the regex, ^ and $ are ordinary characters. |
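
Several of the forms in Table 29 have close counterparts in other regular expression engines, so they can be tried out interactively. The sketch below uses Python's `re` module purely as a stand-in: it is not the engine this document describes, the helper `matches` and all sample strings are made up for illustration, and Python writes a tagged group as `( )` where Table 29 writes `\( \)`.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None


# [4] Sets. Python also accepts a leading ] or - in a set, but the
# escaped spelling used here is less ambiguous to read.
set_checks = {
    "letter_or_at": matches(r"[a-zA-Z@]", "@"),             # True
    "neither_bracket_nor_dash": matches(r"[^\]\-]", "x"),   # True
    "bracket_is_excluded": matches(r"[^\]\-]", "]"),        # False
    "space_or_tab": matches(r"[ \t]", "\t"),                # True
}

# [5] and [6] Closures: * allows zero occurrences, + requires at least one.
star_allows_zero = matches(r"ab*", "a")   # True
plus_allows_zero = matches(r"ab+", "a")   # False

# [7] and [8] Tagged group and back-reference. Python writes the group as
# ( ) where the dialect in Table 29 writes \( \).
repeat_same = matches(r"(ab)\1", "abab")    # True
repeat_other = matches(r"(ab)\1", "abba")   # False

# [9] Composite forms: the first form takes the longest match it can.
greedy = re.match(r"(a*)(a*)", "aaa").groups()   # ('aaa', '')

# [10] Anchors restrict the match to the start or end of the line.
ends_with_dot_o = re.search(r"\.o$", "main.o") is not None     # True
starts_with_dot_o = re.search(r"^\.o", "main.o") is not None   # False
```

The rule in [10] that ^ and $ are ordinary characters away from the ends of the expression does not carry over to Python, where they stay special unless escaped.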
Once a regular expression has matched, you can reference the matched part:
& refers to the entire matched string (but \& is a literal ampersand).
If \( \) is used to delimit a part of the regular expression, the tag \1 refers to the first delimited part of the matched substring. Successive pairs of \( \) are tagged \2, \3, ..., \9.
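
As a rough illustration of these two kinds of reference, here is a sketch using Python's `re.sub`. Python is only an analogy here, not the engine this document describes, and the file names are invented for the example. The notation also differs: Python writes the whole match as `\g<0>` where this dialect writes `&`, and it delimits a tagged part with `( )` rather than `\( \)`.

```python
import re

# Whole match: the dialect above writes &, Python writes \g<0>.
wrapped = re.sub(r"[a-z]+", r"<\g<0>>", "main")

# Tagged part: \1 refers to the first delimited part in both syntaxes.
object_name = re.sub(r"([a-z]+)\.c", r"\1.o", "main.c")

print(wrapped)      # <main>
print(object_name)  # main.o
```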
Copyright © 2001 by Rational Software Corporation. All rights reserved.