Appendix H Regular Expressions

A regular expression is a pattern of text that consists of ordinary characters (such as the letters a through z) and special characters known as metacharacters. The pattern describes one or more strings to match when searching a body of text. The regular expression acts as a template for matching a character pattern against the string being searched.
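As a rough illustration of the template idea, the following sketch uses Python's re module. Python is an assumption made only for this example and is not the engine this appendix describes; the sample text is hypothetical. The pattern mixes ordinary characters, which match themselves, with a bracketed character set built from metacharacters.

```python
import re

# Ordinary characters "gr" and "y" match themselves; the metacharacters
# "[" and "]" build a character set that accepts either "a" or "e".
pattern = re.compile(r"gr[ae]y")

# Hypothetical sample text, used only for illustration.
text = "The grey cat sat beside the gray dog."

# findall returns every non-overlapping substring that fits the template.
found = pattern.findall(text)
print(found)
```

Both spellings are found because the single pattern describes more than one string.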

The following table contains the complete list of metacharacters and their behavior in the context of a regular expression.

Character Description

\

Marks the next character as either a special character or a literal

^

Matches the beginning of input

$

Matches the end of input

*

Matches the preceding character zero or more times

+

Matches the preceding character one or more times

?

Matches the preceding character zero or one time

.

Matches any single character except a newline character

(pattern)

Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection, using Item [0]...[n]. To match the parenthesis characters themselves, precede them with a backslash: use "\(" or "\)"

(?:pattern)

Matches pattern but does not capture the match; that is, it is a noncapturing match that is not stored for possible later use. This is useful for combining parts of a pattern with the "or" character (|). For example, "anomal(?:y|ies)" is a more economical expression than "anomaly|anomalies"

(?=pattern)

Positive lookahead matches the search string at any point where a string matching pattern begins. This is a noncapturing match; that is, the match is not captured for possible later use. For example, "Windows (?=95|98|NT|2000|XP)" matches "Windows" in "Windows XP" but not "Windows" in "Windows 3.1"

(?!pattern)

Negative lookahead matches the search string at any point where a string not matching pattern begins. This is a noncapturing match; that is, the match is not captured for possible later use. For example, "Windows (?!95|98|NT|2000|XP)" matches "Windows" in "Windows 3.1" but does not match "Windows" in "Windows XP"

x|y

Matches either x or y

{n}

Matches exactly n times (n must always be a nonnegative integer)

{n,}

Matches at least n times (n must always be a nonnegative integer; note the terminating comma)

{n,m}

Matches at least n and at most m times (m and n must always be nonnegative integers)

[xyz]

Matches any one of the enclosed characters (xyz represents a character set)

[^xyz]

Matches any character not enclosed (^xyz represents a negative character set)

[a-z]

Matches any character in the specified range (a-z represents a range of characters)

[^m-z]

Matches any character not in the specified range (^m-z represents a negative range of characters)

\b

Matches a word boundary, that is, the position between a word and a space

\B

Matches a nonword boundary

\d

Matches a digit character. Equivalent to [0-9]

\D

Matches a nondigit character. Equivalent to [^0-9]

\f

Matches a form-feed character

\n

Matches a newline character

\r

Matches a carriage return character

\s

Matches any white space including space, tab, form-feed, and so on. Equivalent to "[ \f\n\r\t\v]"

\S

Matches any nonwhite space character. Equivalent to "[^ \f\n\r\t\v]"

\t

Matches a tab character

\v

Matches a vertical tab character

\w

Matches any word character including underscore. Equivalent to "[A-Za-z0-9_]"

\W

Matches any nonword character. Equivalent to "[^A-Za-z0-9_]"

\.

Matches .

\|

Matches |

\{

Matches {

\}

Matches }

\\

Matches \

\[

Matches [

\]

Matches ]

\(

Matches (

\)

Matches )

$num

Matches num, where num is a positive integer. A reference back to remembered matches (note the $ symbol, which differs from some Microsoft documentation)

\n

Matches n, where n is an octal escape value (here n stands for octal digits, not the letter n). Octal escape values must be 1, 2, or 3 digits long

\uxxxx

Matches the character expressed by the Unicode value xxxx, where xxxx is four hexadecimal digits

\xn

Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long
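Several of the table entries can be tried out in a short sketch. The example below uses Python's re module, which is an assumption made for illustration and not the engine this appendix documents: Python has no Matches collection, and it writes back references in replacement text as \1 rather than $1. All sample strings and variable names are hypothetical.

```python
import re

# (pattern): capturing groups; each captured substring is retrievable by index.
date = re.search(r"(\d{4})-(\d{2})", "Released 2003-07")
year, month = date.group(1), date.group(2)

# (?:pattern): groups the alternation without storing a capture,
# so findall returns the whole matches.
forms = re.findall(r"anomal(?:y|ies)", "one anomaly, two anomalies")

# (?=pattern) and (?!pattern): lookahead tests what follows without consuming it.
listed = r"Windows (?=95|98|NT|2000|XP)"
unlisted = r"Windows (?!95|98|NT|2000|XP)"
listed_xp = re.search(listed, "Windows XP")        # matches
listed_31 = re.search(listed, "Windows 3.1")       # no match
unlisted_31 = re.search(unlisted, "Windows 3.1")   # matches
unlisted_xp = re.search(unlisted, "Windows XP")    # no match

# {n,m}: at least n and at most m repetitions of the preceding character.
two_to_three = [bool(re.fullmatch(r"a{2,3}", s)) for s in ("a", "aa", "aaa", "aaaa")]

# [^m-z]: negative range; \s is added to the set so the space is skipped too.
outside_range = re.findall(r"[^m-z\s]", "map key")

# \b: word boundary, so only the standalone word matches.
whole_words = re.findall(r"\bcat\b", "cat concat cats")

# Back reference to remembered matches. ASSUMPTION: Python's re module
# writes these as \1 and \2 in the replacement text, not $1 and $2.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# \xn, \uxxxx and a three-digit octal escape each name a character by its code.
coded = re.fullmatch(r"\x41\u0042\103", "ABC")

print(year, month, forms, two_to_three, outside_range, whole_words, swapped)
```

Other engines may differ in details such as how $ treats a trailing newline, so results should be checked against the engine actually in use.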
