Example 1: for string "abcde"
ab.* # match
abcd # no match
Example 2: for string "abcde"
ab... # match
a.c.e # match
The period "." matches any single character except the newline character \n.
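The two examples above can be checked with a short script. This is only a sketch: it uses Python's re module, which is a different engine from the one this syntax belongs to, and the matches helper is a hypothetical name introduced here. It uses re.fullmatch because the examples treat a pattern as matching only when it covers the whole string.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only when the pattern covers the whole text.
    return re.fullmatch(pattern, text) is not None

print(matches(r"ab.*", "abcde"))   # True
print(matches(r"abcd", "abcde"))   # False: the trailing "e" is left over
print(matches(r"ab...", "abcde"))  # True
print(matches(r"a.c.e", "abcde"))  # True
print(matches(r".", "\n"))         # False: "." does not match a newline here
```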
Example 3: for string "aaabbb"
a+b+ # match
aa+bb+ # match
a+.+ # match
aa+bbb+ # match
Example 4: for string "aaabbb"
a*b* # match
a*b*c* # match
.*bbb.* # match
aaa*bbb* # match
Example 5: for string "aaabbb"
aaa?bbb? # match
aaaa?bbbb? # match
.....?.? # match
aa?bb? # no match
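The behaviour of "+", "*" and "?" in examples 3 to 5 can be reproduced with the sketch below. It assumes Python's re module as a stand-in engine and whole-string matching via re.fullmatch. The names TEXT, PATTERNS and results are illustrative only.

```python
import re

TEXT = "aaabbb"
PATTERNS = ["a+b+", "a*b*c*", "aaa?bbb?", "aaaa?bbbb?", ".....?.?", "aa?bb?"]

# Map each pattern to whether it matches the entire text.
results = {p: re.fullmatch(p, TEXT) is not None for p in PATTERNS}

for pattern, ok in results.items():
    print(f"{pattern:12} {'match' if ok else 'no match'}")
```

The last pattern fails because "aa?bb?" can cover at most four characters, while the string has six.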
Curly brackets "{}" can be used to specify a minimum and (optionally) a maximum number of times the preceding shortest pattern can repeat. The allowed forms are:
{5} # repeat exactly 5 times
{2,5} # repeat at least twice and at most 5 times
{2,} # repeat at least twice
For string "aaabbb"
a{3}b{3} # match
a{2,4}b{2,4} # match
a{2,}b{2,} # match
.{3}.{3} # match
a{4}b{4} # no match
a{4,6}b{4,6} # no match
a{4,}b{4,} # no match
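As a rough check of the three repetition forms, the following sketch runs a few of the patterns above through Python's re module, which happens to use the same "{n}", "{n,m}" and "{n,}" notation. The helper name matches is an assumption made for this example.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Whole-string match, mirroring how the examples above are evaluated.
    return re.fullmatch(pattern, text) is not None

text = "aaabbb"
print(matches(r"a{3}b{3}", text))      # True: exactly three of each
print(matches(r"a{2,4}b{2,4}", text))  # True: three lies between 2 and 4
print(matches(r"a{2,}b{2,}", text))    # True: at least two of each
print(matches(r"a{4}b{4}", text))      # False: only three of each are present
print(matches(r"a{4,}b{4,}", text))    # False: fewer than four of each
```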
Parentheses "()" can be used to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can be a group. For string "ababab"
(ab)+ # match
ab(ab)+ # match
(..)+ # match
(...)+ # match
(ab)* # match
abab(ab)? # match
ab(ab)? # no match
(ab){3} # match
(ab){1,2} # no match
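The sketch below shows a quantifier acting on a whole group rather than on a single character. It again uses Python's re module with re.fullmatch as an assumed stand-in. The helper name matches is hypothetical.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Whole-string match only.
    return re.fullmatch(pattern, text) is not None

text = "ababab"
print(matches(r"(ab)+", text))      # True: "ab" repeated three times
print(matches(r"(..)+", text))      # True: three two-character groups
print(matches(r"ab(ab)?", text))    # False: covers at most four characters
print(matches(r"(ab){3}", text))    # True: exactly three repetitions
print(matches(r"(ab){1,2}", text))  # False: at most two repetitions
```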
The pipe symbol "|" acts as an OR operator. The match will succeed if the pattern on either the left-hand side OR the right-hand side matches. The alternation applies to the longest pattern, not the shortest. For string "aabb"
aabb|bbaa # match
aacc|bb # no match
aa(cc|bb) # match
a+|b+ # no match
a+b+|b+a+ # match
a+(b|c)+ # match
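Because alternation applies to the longest pattern on each side, grouping changes the result. A minimal sketch, assuming Python's re module (where "|" also has the lowest precedence) and a hypothetical matches helper:

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Whole-string match only.
    return re.fullmatch(pattern, text) is not None

text = "aabb"
print(matches(r"aacc|bb", text))    # False: read as "aacc" OR "bb"
print(matches(r"aa(cc|bb)", text))  # True: "aa" followed by "cc" OR "bb"
print(matches(r"a+|b+", text))      # False: neither side covers the whole string
print(matches(r"a+b+|b+a+", text))  # True: the left-hand side matches
```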
Character classes
Ranges of potential characters may be represented as character classes by enclosing them in square brackets "[]". A leading ^ negates the character class. The allowed forms are:
[abc] # 'a' or 'b' or 'c'
[a-c] # 'a' or 'b' or 'c'
[-abc] # '-' or 'a' or 'b' or 'c'
[abc\-] # '-' or 'a' or 'b' or 'c'
[^abc] # any character except 'a' or 'b' or 'c'
[^a-c] # any character except 'a' or 'b' or 'c'
[^-abc] # any character except '-' or 'a' or 'b' or 'c'
[^abc\-] # any character except '-' or 'a' or 'b' or 'c'
Note that the dash "-" indicates a range of characters, unless it is the first character or if it is escaped with a backslash.
For string "abcd"
ab[cd]+ # match
[a-d]+ # match
[^a-d]+ # no match
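The treatment of the dash can be checked with a short sketch. It assumes Python's re module, which follows the same conventions for the forms listed above. The helper name matches is hypothetical.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Whole-string match only.
    return re.fullmatch(pattern, text) is not None

print(matches(r"[a-c]", "b"))      # True: the dash forms the range a..c
print(matches(r"[a-c]", "-"))      # False: the dash is not a member here
print(matches(r"[-abc]", "-"))     # True: a leading dash is a literal
print(matches(r"[abc\-]", "-"))    # True: an escaped dash is a literal
print(matches(r"[^a-d]+", "abcd")) # False: every character is excluded
```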
The complement is probably the most useful option. The shortest pattern that follows a tilde "~" is negated. For instance, "ab~cd" means:
- Starts with a
- Followed by b
- Followed by a string of any length that is anything but c
- Ends with d
For the string "abcdef"
ab~df # match
ab~cf # match
ab~cdef # no match
a~(cb)def # match
a~(bc)def # no match
Enabled with the COMPLEMENT or ALL flags.
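Python's re module has no "~" operator, so the sketch below only emulates the complement for the simple shape used in the examples: a literal prefix, a negated sub-pattern, and a literal suffix. The function name matches_with_complement and its parameters are hypothetical, and this is not how the real engine evaluates the operator.

```python
import re

def matches_with_complement(text: str, prefix: str, negated: str, suffix: str) -> bool:
    # Emulates prefix~(negated)suffix for a literal prefix and suffix.
    if len(text) < len(prefix) + len(suffix):
        return False
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return False
    middle = text[len(prefix):len(text) - len(suffix)]
    # The complement accepts any middle part that "negated" does not fully match.
    return re.fullmatch(negated, middle) is None

print(matches_with_complement("abcdef", "ab", "d", "f"))    # True  (ab~df)
print(matches_with_complement("abcdef", "ab", "c", "def"))  # False (ab~cdef)
print(matches_with_complement("abcdef", "a", "cb", "def"))  # True  (a~(cb)def)
print(matches_with_complement("abcdef", "a", "bc", "def"))  # False (a~(bc)def)
```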
The interval option enables the use of numeric ranges, enclosed by angle brackets "<>". For string "foo80"
foo<1-100> # match
foo<01-100> # match
foo<001-100> # no match
Enabled with the INTERVAL or ALL flags.
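Python's re module has no "<>" interval syntax, so the following is only an emulation of the three examples. It rests on an assumption that is consistent with those examples but not stated in the text: when both bounds are written with the same number of digits, the number must be zero-padded to that width. The function name matches_interval and its parameters are hypothetical.

```python
import re

def matches_interval(text: str, prefix: str, low: str, high: str) -> bool:
    # Emulates prefix<low-high>; bounds are strings so their width is kept.
    if not text.startswith(prefix):
        return False
    digits = text[len(prefix):]
    if re.fullmatch(r"[0-9]+", digits) is None:
        return False
    # Assumption: equal-width bounds require exactly that many digits.
    if len(low) == len(high) and len(digits) != len(low):
        return False
    return int(low) <= int(digits) <= int(high)

print(matches_interval("foo80", "foo", "1", "100"))    # True
print(matches_interval("foo80", "foo", "01", "100"))   # True
print(matches_interval("foo80", "foo", "001", "100"))  # False
```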
The ampersand "&" joins two patterns in a way that both of them have to match. For string "aaabbb"
aaa.+&.+bbb # match
aaa&bbb # no match
Using this feature usually means that you should rewrite your regular expression.
Enabled with the INTERSECTION or ALL flags.
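Intersection can be emulated in an engine without an "&" operator by requiring every sub-pattern to match the whole string. A minimal sketch with Python's re module follows. The function name matches_all is hypothetical.

```python
import re

def matches_all(text: str, *patterns: str) -> bool:
    # Every pattern must match the entire text for the intersection to hold.
    return all(re.fullmatch(p, text) is not None for p in patterns)

print(matches_all("aaabbb", r"aaa.+", r".+bbb"))  # True  (aaa.+&.+bbb)
print(matches_all("aaabbb", r"aaa", r"bbb"))      # False (aaa&bbb)
```

The second call fails because neither "aaa" nor "bbb" covers the whole six-character string on its own.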
Any string
The at sign "@" matches any string in its entirety. It can be combined with the intersection and complement above to express "everything except". For instance:
@&~(foo.+) # anything except strings beginning with "foo"
Enabled with the ANYSTRING or ALL flags.
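Since "@" accepts every string, intersecting it with a complement reduces to "does not match the excluded pattern". The sketch below emulates that with Python's re module, which has neither operator. The function name anything_except is hypothetical.

```python
import re

def anything_except(text: str, excluded: str) -> bool:
    # "@" accepts everything, so only the complement part decides the result.
    return re.fullmatch(excluded, text) is None

print(anything_except("foobar", r"foo.+"))  # False: "foo" plus more characters
print(anything_except("bar", r"foo.+"))     # True
print(anything_except("foo", r"foo.+"))     # True
```

Under this emulation the bare string "foo" is still accepted, because "foo.+" requires at least one character after the prefix.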