Regular Expression Basic Syntax Reference

Characters
Character: Any character except [\^$.|?*+()
Description: All characters except the listed special characters match a single instance of themselves. { and } are literal characters, unless they're part of a valid regular expression token (e.g. the {n} quantifier).
Example: a matches a

Character: \ (backslash) followed by any of [\^$.|?*+(){}
Description: A backslash escapes special characters to suppress their special meaning.
Example: \+ matches +

Character: \Q...\E
Description: Matches the characters between \Q and \E literally, suppressing the meaning of special characters.
Example: \Q+-*/\E matches +-*/

Character: \xFF where FF are 2 hexadecimal digits
Description: Matches the character with the specified ASCII/ANSI value, which depends on the code page used. Can be used in character classes.
Example: \xA9 matches © when using the Latin-1 code page.

Character: \n, \r and \t
Description: Match an LF character, CR character and a tab character respectively. Can be used in character classes.
Example: \r\n matches a DOS/Windows CRLF line break.

Character: \a, \e, \f and \v
Description: Match a bell character (\x07), escape character (\x1B), form feed (\x0C) and vertical tab (\x0B) respectively. Can be used in character classes.

Character: \cA through \cZ
Description: Match an ASCII character Control+A through Control+Z, equivalent to \x01 through \x1A. Can be used in character classes.
Example: \cM\cJ matches a DOS/Windows CRLF line break.
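
To make the escaping rules concrete, here is a minimal sketch that assumes Java's java.util.regex flavor, which supports both backslash escapes and \Q...\E. The class name LiteralEscapeDemo and the helper containsLiteral are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

class LiteralEscapeDemo {
    // Hypothetical helper: true when `text` contains `literal` as plain characters.
    static boolean containsLiteral(String text, String literal) {
        // Pattern.quote wraps the literal in \Q...\E, so +, * and / lose their special meaning.
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        // A single escaped special character.
        System.out.println(Pattern.matches("\\+", "+"));            // true
        // The same idea for a whole run of special characters.
        System.out.println(Pattern.matches("\\Q+-*/\\E", "+-*/"));  // true
        System.out.println(containsLiteral("1+-*/2", "+-*/"));      // true
        System.out.println(containsLiteral("1+2", "+-*/"));         // false
    }
}
```

Note that each regex backslash is doubled inside a Java string literal, because the Java compiler consumes one level of escaping before the regex engine sees the pattern.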
Character Classes or Character Sets [abc]
Character: [ (opening square bracket)
Description: Starts a character class. A character class matches a single character out of all the possibilities offered by the character class. Inside a character class, different rules apply. The rules in this section are only valid inside character classes. The rules outside this section are not valid in character classes, except for a few character escapes that are indicated with "can be used inside character classes".

Character: Any character except ^-]\
Description: All characters except the listed special characters add that character to the possible matches for the character class.
Example: [abc] matches a, b or c

Character: \ (backslash) followed by any of ^-]\
Description: A backslash escapes special characters to suppress their special meaning.
Example: [\^\]] matches ^ or ]

Character: - (hyphen) except immediately after the opening [
Description: Specifies a range of characters. (Specifies a hyphen if placed immediately after the opening [.)
Example: [a-zA-Z0-9] matches any letter or digit

Character: ^ (caret) immediately after the opening [
Description: Negates the character class, causing it to match a single character not listed in the character class. (Specifies a caret if placed anywhere except after the opening [.)
Example: [^a-d] matches x (any character except a, b, c or d)

Character: \d, \w and \s
Description: Shorthand character classes matching digits, word characters (letters, digits, and underscores), and whitespace (spaces, tabs, and line breaks). Can be used inside and outside character classes.
Example: [\d\s] matches a character that is a digit or whitespace

Character: \D, \W and \S
Description: Negated versions of the above. Should be used only outside character classes. (Can be used inside, but that is confusing.)
Example: \D matches a character that is not a digit

Character: [\b]
Description: Inside a character class, \b is a backspace character.
Example: [\b\t] matches a backspace or tab character
Dot
Character: . (dot)
Description: Matches any single character except line break characters \r and \n. Most regex flavors have an option to make the dot match line break characters too.
Example: . matches x or (almost) any other character
Anchors
Character: ^ (caret)
Description: Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the caret match after line breaks (i.e. at the start of a line in a file) as well.
Example: ^. matches a in abc\ndef. Also matches d in "multi-line" mode.

Character: $ (dollar)
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the dollar match before line breaks (i.e. at the end of a line in a file) as well. Also matches before the very last line break if the string ends with a line break.
Example: .$ matches f in abc\ndef. Also matches c in "multi-line" mode.

Character: \A
Description: Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character. Never matches after line breaks.
Example: \A. matches a in abc

Character: \Z
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Never matches before line breaks, except for the very last line break if the string ends with a line break.
Example: .\Z matches f in abc\ndef

Character: \z
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Never matches before line breaks.
Example: .\z matches f in abc\ndef
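
The anchor rules above can be checked with a short sketch. It assumes Java's java.util.regex flavor, where "multi-line" mode is the Pattern.MULTILINE flag; the class AnchorDemo and the helper findAll are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: collects every match of `regex` in `input` using the given flags.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "abc\ndef";
        System.out.println(findAll("^.", 0, text));                  // [a]
        System.out.println(findAll("^.", Pattern.MULTILINE, text));  // [a, d]
        System.out.println(findAll(".$", 0, text));                  // [f]
        System.out.println(findAll(".$", Pattern.MULTILINE, text));  // [c, f]
        // With a trailing line break, \Z still matches before it, but \z does not.
        System.out.println(findAll(".\\Z", 0, text + "\n"));         // [f]
        System.out.println(findAll(".\\z", 0, text + "\n"));         // []
    }
}
```

The last two lines show the only difference between \Z and \z: a single line break at the very end of the string.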
Word Boundaries
Character: \b
Description: Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W) as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.
Example: .\b matches c in abc

Character: \B
Description: Matches at the position between two word characters (i.e. the position between \w\w) as well as at the position between two non-word characters (i.e. \W\W).
Example: \B.\B matches b in abc
Alternation
Character: | (pipe)
Description: Causes the regex engine to match either the part on the left side, or the part on the right side. Can be strung together into a series of options.
Example: abc|def|xyz matches abc, def or xyz

Character: | (pipe)
Description: The pipe has the lowest precedence of all operators. Use grouping to alternate only part of the regular expression.
Example: abc(def|xyz) matches abcdef or abcxyz
Quantifiers
Character: ? (question mark)
Description: Makes the preceding item optional. Greedy, so the optional item is included in the match if possible.
Example: abc? matches ab or abc

Character: ??
Description: Makes the preceding item optional. Lazy, so the optional item is excluded from the match if possible. This construct is often excluded from documentation because of its limited use.
Example: abc?? matches ab or abc

Character: * (star)
Description: Repeats the previous item zero or more times. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is not matched at all.
Example: ".*" matches "def" "ghi" in abc "def" "ghi" jkl

Character: *? (lazy star)
Description: Repeats the previous item zero or more times. Lazy, so the engine first attempts to skip the previous item, before trying permutations with ever increasing matches of the preceding item.
Example: ".*?" matches "def" in abc "def" "ghi" jkl

Character: + (plus)
Description: Repeats the previous item once or more. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is matched only once.
Example: ".+" matches "def" "ghi" in abc "def" "ghi" jkl

Character: +? (lazy plus)
Description: Repeats the previous item once or more. Lazy, so the engine first matches the previous item only once, before trying permutations with ever increasing matches of the preceding item.
Example: ".+?" matches "def" in abc "def" "ghi" jkl

Character: {n} where n is an integer >= 1
Description: Repeats the previous item exactly n times.
Example: a{3} matches aaa

Character: {n,m} where n >= 0 and m >= n
Description: Repeats the previous item between n and m times. Greedy, so repeating m times is tried before reducing the repetition to n times.
Example: a{2,4} matches aaaa, aaa or aa

Character: {n,m}? where n >= 0 and m >= n
Description: Repeats the previous item between n and m times. Lazy, so repeating n times is tried before increasing the repetition to m times.
Example: a{2,4}? matches aa, aaa or aaaa

Character: {n,} where n >= 0
Description: Repeats the previous item at least n times. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is matched only n times.
Example: a{2,} matches aaaaa in aaaaa

Character: {n,}? where n >= 0
Description: Repeats the previous item n or more times. Lazy, so the engine first matches the previous item n times, before trying permutations with ever increasing matches of the preceding item.
Example: a{2,}? matches aa in aaaaa
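
The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below assumes Java's java.util.regex flavor and reuses the example strings from this section; GreedyLazyDemo and firstMatch are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyLazyDemo {
    // Hypothetical helper: returns the first match of `regex` in `input`, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "abc \"def\" \"ghi\" jkl";
        // Greedy: runs to the last quote, then stops backtracking as soon as the regex fits.
        System.out.println(firstMatch("\".*\"", text));    // "def" "ghi"
        // Lazy: expands one character at a time and stops at the first closing quote.
        System.out.println(firstMatch("\".*?\"", text));   // "def"
        System.out.println(firstMatch("a{2,4}", "aaaaa"));   // aaaa
        System.out.println(firstMatch("a{2,4}?", "aaaaa"));  // aa
        System.out.println(firstMatch("a{2,}", "aaaaa"));    // aaaaa
        System.out.println(firstMatch("a{2,}?", "aaaaa"));   // aa
        System.out.println(firstMatch("abc?", "abc"));       // abc
        System.out.println(firstMatch("abc??", "abc"));      // ab
    }
}
```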
Regular Expression Advanced Syntax Reference
Grouping and Backreferences
Syntax: (regex)
Description: Round brackets group the regex between them. They capture the text matched by the regex inside them that can be reused in a backreference, and they allow you to apply regex operators to the entire grouped regex.
Example: (abc){3} matches abcabcabc. First group matches abc.

Syntax: (?:regex)
Description: Non-capturing parentheses group the regex so you can apply regex operators, but do not capture anything and do not create backreferences.
Example: (?:abc){3} matches abcabcabc. No groups.

Syntax: \1 through \9
Description: Substituted with the text matched between the 1st through 9th pair of capturing parentheses. Some regex flavors allow more than 9 backreferences.
Example: (abc|def)=\1 matches abc=abc or def=def, but not abc=def or def=abc.
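
As a quick check of capturing groups, non-capturing groups and backreferences, here is a small sketch assuming Java's java.util.regex flavor. BackreferenceDemo, sameWordTwice and groupCount are hypothetical names for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackreferenceDemo {
    // \1 must repeat exactly the text that the first group captured.
    static boolean sameWordTwice(String s) {
        return Pattern.matches("(abc|def)=\\1", s);
    }

    // Number of capturing groups that a pattern defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(sameWordTwice("abc=abc"));  // true
        System.out.println(sameWordTwice("def=def"));  // true
        System.out.println(sameWordTwice("abc=def"));  // false

        Matcher m = Pattern.compile("(abc){3}").matcher("abcabcabc");
        if (m.matches()) {
            System.out.println(m.group(1));            // abc
        }
        System.out.println(groupCount("(abc){3}"));    // 1
        System.out.println(groupCount("(?:abc){3}"));  // 0
    }
}
```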
Modifiers
Syntax: (?i)
Description: Turn on case insensitivity for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)
Example: te(?i)st matches teST but not TEST.

Syntax: (?-i)
Description: Turn off case insensitivity for the remainder of the regular expression.
Example: (?i)te(?-i)st matches TEst but not TEST.

Syntax: (?s)
Description: Turn on "dot matches newline" for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)

Syntax: (?-s)
Description: Turn off "dot matches newline" for the remainder of the regular expression.

Syntax: (?m)
Description: Caret and dollar match after and before newlines for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)

Syntax: (?-m)
Description: Caret and dollar only match at the start and end of the string for the remainder of the regular expression.

Syntax: (?x)
Description: Turn on free-spacing mode to ignore whitespace between regex tokens, and allow # comments.

Syntax: (?-x)
Description: Turn off free-spacing mode.

Syntax: (?i-sm)
Description: Turns on the option "i" and turns off the options "s" and "m" (the letters after the hyphen) for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)

Syntax: (?i-sm:regex)
Description: Matches the regex inside the span with the option "i" turned on, and "s" and "m" turned off.
Example: (?i:te)st matches TEst but not TEST.
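
The mode modifiers can be tried out with a few one-liners. This is a sketch assuming Java's java.util.regex flavor, which applies an inline modifier from the point where it appears; ModifierDemo and its two helpers are hypothetical names.

```java
import java.util.regex.Pattern;

class ModifierDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true when the regex matches somewhere in the input.
    static boolean foundIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesAll("te(?i)st", "teST"));       // true
        System.out.println(matchesAll("te(?i)st", "TEST"));       // false
        System.out.println(matchesAll("(?i)te(?-i)st", "TEst"));  // true
        System.out.println(matchesAll("(?i)te(?-i)st", "TEST"));  // false
        System.out.println(matchesAll("(?i:te)st", "TEst"));      // true
        // (?s) lets the dot match the line break between a and b.
        System.out.println(matchesAll("a.b", "a\nb"));            // false
        System.out.println(matchesAll("(?s)a.b", "a\nb"));        // true
        // (?m) lets the caret match after the line break.
        System.out.println(foundIn("^d", "abc\ndef"));            // false
        System.out.println(foundIn("(?m)^d", "abc\ndef"));        // true
    }
}
```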
Atomic Grouping and Possessive Quantifiers

Syntax: (?>regex)
Description: Atomic groups prevent the regex engine from backtracking back into the group (forcing the group to discard part of its match) after a match has been found for the group. Backtracking can occur inside the group before it has matched completely, and the engine can backtrack past the entire group, discarding its match entirely. Eliminating needless backtracking provides a speed increase. Atomic grouping is often indispensable when nesting quantifiers to prevent a catastrophic amount of backtracking as the engine needlessly tries pointless permutations of the nested quantifiers.
Example: x(?>\w+)x is more efficient than x\w+x if the second x cannot be matched.

Syntax: ?+, *+, ++ and {m,n}+
Description: Possessive quantifiers are a limited yet syntactically cleaner alternative to atomic grouping. Only available in a few regex flavors. They behave as normal greedy quantifiers, except that they will not give up part of their match for backtracking.
Example: x++ is identical to (?>x+)
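
A short sketch shows what "will not give up part of their match" means in practice. It assumes Java's java.util.regex flavor, which supports both atomic groups and possessive quantifiers; PossessiveDemo and matchesAll are hypothetical names. One caveat worth seeing in action: because \w also matches x, the atomic and possessive versions below are not just faster than x\w+x, they also reject strings that the greedy version accepts.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy \w+ first swallows "abcx", then backtracks one character to free the final x.
        System.out.println(matchesAll("x\\w+x", "xabcx"));      // true
        // The atomic group keeps "abcx", so nothing is left for the final x.
        System.out.println(matchesAll("x(?>\\w+)x", "xabcx"));  // false
        // The possessive quantifier behaves the same way as the atomic group.
        System.out.println(matchesAll("x\\w++x", "xabcx"));     // false
        // When the repeated class cannot match x, no backtracking is needed and the match succeeds.
        System.out.println(matchesAll("x[a-w]++x", "xabcx"));   // true
    }
}
```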
Lookaround
Syntax: (?=regex)
Description: Zero-width positive lookahead. Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three, both two and three have to match at the position where the match of one ends.
Example: t(?=s) matches the second t in streets.

Syntax: (?!regex)
Description: Zero-width negative lookahead. Identical to positive lookahead, except that the overall match will only succeed if the regex inside the lookahead fails to match.
Example: t(?!s) matches the first t in streets.

Syntax: (?<=text)
Description: Zero-width positive lookbehind. Matches at a position to the left of which text appears. Since regular expressions cannot be applied backwards, the test inside the lookbehind can only be plain text. Some regex flavors allow alternation of plain text options in the lookbehind.
Example: (?<=s)t matches the first t in streets.

Syntax: (?<!text)
Description: Zero-width negative lookbehind. Matches at a position if the text does not appear to the left of that position.
Example: (?<!s)t matches the second t in streets.
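
The four lookaround examples can be verified by printing where each match starts. This sketch assumes Java's java.util.regex flavor; LookaroundDemo and matchStarts are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: returns the start index of every match of `regex` in `input`.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String word = "streets";  // the first t is at index 1, the second t at index 5
        System.out.println(matchStarts("t(?=s)", word));   // [5]
        System.out.println(matchStarts("t(?!s)", word));   // [1]
        System.out.println(matchStarts("(?<=s)t", word));  // [1]
        System.out.println(matchStarts("(?<!s)t", word));  // [5]
    }
}
```

In every case the match itself is only the single character t; the lookaround decides where that t may be found without becoming part of the match.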
Continuing from The Previous Match
Syntax: \G
Description: Matches at the position where the previous match ended, or the position where the current match attempt started (depending on the tool or regex flavor). Matches at the start of the string during the first match attempt.
Example: \G[a-z] first matches a, then matches b and then fails to match in ab_cd.
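
Here is a minimal sketch of the \G example, assuming Java's java.util.regex flavor, in which \G matches where the previous match ended. ContinueMatchDemo and contiguousLetters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContinueMatchDemo {
    // Hypothetical helper: collects lowercase letters only while each match
    // starts exactly where the previous one ended.
    static List<String> contiguousLetters(String input) {
        List<String> letters = new ArrayList<>();
        Matcher m = Pattern.compile("\\G[a-z]").matcher(input);
        while (m.find()) {
            letters.add(m.group());
        }
        return letters;
    }

    public static void main(String[] args) {
        // The underscore breaks the chain, so c and d are never reached.
        System.out.println(contiguousLetters("ab_cd"));  // [a, b]
    }
}
```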
Conditionals
Syntax: (?(?=regex)then|else)
Description: If the lookahead succeeds, the "then" part must match for the overall regex to match. If the lookahead fails, the "else" part must match for the overall regex to match. Not just positive lookahead, but all four lookarounds can be used. Note that the lookahead is zero-width, so the "then" and "else" parts need to match and consume the part of the text matched by the lookahead as well.
Example: (?(?<=a)b|c) matches the second b and the first c in babxcac

Syntax: (?(1)then|else)
Description: If the first capturing group took part in the match attempt thus far, the "then" part must match for the overall regex to match. If the first capturing group did not take part in the match, the "else" part must match for the overall regex to match.
Example: (a)?(?(1)b|c) matches ab, the first c and the second c in babxcac
Comments
Syntax: (?#comment)
Description: Everything between (?# and ) is ignored by the regex engine.
Example: a(?#foobar)b matches ab
printf - a quick look at Perl and Java
In this cheat sheet I'm going to show all the examples using Perl, but I thought it might help
to first show one printf example using both Perl and Java. So, here's a simple Perl printf
example to get us started:
printf("the %s jumped over the %s, %d times", "cow", "moon", 2);
And here are three different ways of using printf format specifier syntax with Java:
System.out.format("the %s jumped over the %s, %d times", "cow", "moon", 2);
System.err.format("the %s jumped over the %s, %d times", "cow", "moon", 2);
String result = String.format("the %s jumped over the %s, %d times", "cow", "moon", 2);
As you can see in that last String.format example, that line of code doesn't print any
output, while the first line prints to standard output, and the second line prints to standard
error.
In the remainder of this document I'm going to use Perl examples, but again, the actual
format specifier strings can be used in many different languages.
A summary of the printf format specifiers

Here's a quick summary of the available print format specifiers:
%c character
%d decimal (integer) number (base 10)
%e exponential floating-point number
%f floating-point number
%i integer (base 10)
%o octal number (base 8)
%s a string of characters
%u unsigned decimal (integer) number
%x number in hexadecimal (base 16)
%% print a percent sign
\% print a percent sign
Controlling printf integer width
The "%3d" specifier means a minimum width of three characters, which, by default, will be
right-justified. The output is shown here between single quotes so that the padding is visible:
printf("%3d", 0);             '  0'
printf("%3d", 123456789);     '123456789'
printf("%3d", -10);           '-10'
printf("%3d", -123456789);    '-123456789'

Left-justifying printf integer output
To left-justify those previous printf examples, just add a minus sign (-) after the % symbol,
like this (again, the output is shown between single quotes so that the padding is visible):
printf("%-3d", 0);            '0  '
printf("%-3d", 123456789);    '123456789'
printf("%-3d", -10);          '-10'
printf("%-3d", -123456789);   '-123456789'
The printf zero-fill option

To zero-fill your integer output, just add a zero (0) after the % symbol, like this:
printf("%03d", 0); 000
printf("%03d", 1); 001
printf("%03d", 123456789); 123456789
printf("%03d", -10); -10
printf("%03d", -123456789); -123456789
printf - integers with formatting
Here is a collection of examples for integer and floating-point printing. Several different options
are shown, including a minimum width specification, left-justified, zero-filled, and also a plus sign
for positive numbers.
Description                                                    Code                                   Result
At least five wide                                             printf("'%5d'", 10);                   '   10'
At least five wide, left-justified                             printf("'%-5d'", 10);                  '10   '
At least five wide, zero-filled                                printf("'%05d'", 10);                  '00010'
At least five wide, with a plus sign                           printf("'%+5d'", 10);                  '  +10'
Five wide, plus sign, left-justified                           printf("'%-+5d'", 10);                 '+10  '
Print one position after the decimal                           printf("'%.1f'", 10.3456);             '10.3'
Two positions after the decimal                                printf("'%.2f'", 10.3456);             '10.35'
Eight wide, two positions after the decimal                    printf("'%8.2f'", 10.3456);            '   10.35'
Eight wide, four positions after the decimal                   printf("'%8.4f'", 10.3456);            ' 10.3456'
Eight wide, two positions after the decimal, zero-filled       printf("'%08.2f'", 10.3456);           '00010.35'
Eight wide, two positions after the decimal, left-justified    printf("'%-8.2f'", 10.3456);           '10.35   '
Printing a much larger number with that same format            printf("'%-8.2f'", 101234567.3456);    '101234567.35'
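
Since the same format specifier strings work in Java, here is a minimal Java sketch of a few rows from the table above. FormatDemo and fmt are hypothetical names, and passing Locale.ROOT is my own choice so that the decimal separator is a period regardless of the machine's locale.

```java
import java.util.Locale;

class FormatDemo {
    // Hypothetical helper: formats with Locale.ROOT so the decimal separator is always a period.
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.ROOT, pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("'%5d'", 10));                 // '   10'
        System.out.println(fmt("'%-5d'", 10));                // '10   '
        System.out.println(fmt("'%05d'", 10));                // '00010'
        System.out.println(fmt("'%+5d'", 10));                // '  +10'
        System.out.println(fmt("'%8.2f'", 10.3456));          // '   10.35'
        System.out.println(fmt("'%08.2f'", 10.3456));         // '00010.35'
        System.out.println(fmt("'%-8.2f'", 10.3456));         // '10.35   '
        // The width is only a minimum: a longer value is never truncated.
        System.out.println(fmt("'%-8.2f'", 101234567.3456));  // '101234567.35'
    }
}
```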
How to print strings with printf formatting

Here are several printf formatting examples that show how to format string output
with printf format specifiers.
Description                        Code                           Result
A simple string                    printf("'%s'", "Hello");       'Hello'
A string with a minimum length     printf("'%10s'", "Hello");     '     Hello'
Minimum length, left-justified     printf("'%-10s'", "Hello");    'Hello     '
Summary of special printf characters

The following character sequences have a special meaning when used inside printf format
strings:
\a audible alert
\b backspace
\f form feed
\n newline, or linefeed
\r carriage return
\t tab
\v vertical tab
\\ backslash
As you can see from that last example, because the backslash character itself is treated
specially, you have to print two backslash characters in a row to get one backslash
character to appear in your output.
Here are a few examples of how to use these special characters:
Description                                     Code                                    Result
Insert a tab character in a string              printf("Hello\tworld");                 Hello   world
Insert a newline character in a string          printf("Hello\nworld");                 Hello
                                                                                        world
Typical use of the newline character            printf("Hello world\n");                Hello world
A DOS/Windows path with backslash characters    printf("C:\\Windows\\System32\\");      C:\Windows\System32\
Algorithms: Big-Oh Notation
How time and space grow as the amount of data increases
It's useful to estimate the CPU or memory resources an algorithm requires. This "complexity analysis"
attempts to characterize the relationship between the number of data elements and resource usage (time or
space) with a simple formula approximation. Many programmers have had ugly surprises when they moved
from small test data to large data sets. This analysis will make you aware of potential problems.
Dominant Term
Big-Oh (the "O" stands for "order of") notation is concerned with what happens for very large values of N,
therefore only the largest term in a polynomial is needed. All smaller terms are dropped.
For example, the number of operations in some sorts is N^2 - N. For large values of N, the single N term is
insignificant compared to N^2, therefore one of these sorts would be described as an O(N^2) algorithm.
Similarly, constant multipliers are ignored. So an O(4*N) algorithm is equivalent to O(N), which is how it
should be written. Ultimately you want to pay attention to these multipliers in determining the performance,
but for the first round of analysis using Big-Oh, you simply ignore constant factors.
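
To see how quickly the smaller term stops mattering, the sketch below computes what fraction of N^2 the full expression N^2 - N represents. DominantTermDemo and ratio are hypothetical names chosen for this illustration.

```java
class DominantTermDemo {
    // Fraction of N^2 that the full expression N^2 - N accounts for (algebraically 1 - 1/N).
    static double ratio(long n) {
        double nSquared = (double) n * n;
        return (nSquared - n) / nSquared;
    }

    public static void main(String[] args) {
        long[] sizes = {10, 1_000, 1_000_000};
        for (long n : sizes) {
            // The ratio approaches 1 as N grows, which is why the N term can be dropped.
            System.out.printf("N = %,d  ratio = %.6f%n", n, ratio(n));
        }
    }
}
```

For N = 10 the dropped term still accounts for 10% of the work, but for N = 1,000,000 it is one part in a million.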
Why Size Matters
Here is a table of typical cases, showing how many "operations" would be performed for various values of N.
Logarithms to base 2 (as used here) are proportional to logarithms in other bases, so this doesn't affect the
big-oh formula.
N            O(1)        O(log N)       O(N)         O(N log N)      O(N^2)         O(N^3)
             constant    logarithmic    linear       linearithmic    quadratic      cubic
1            1           1              1            1               1              1
2            1           1              2            2               4              8
4            1           2              4            8               16             64
8            1           3              8            24              64             512
16           1           4              16           64              256            4,096
1,024        1           10             1,024        10,240          1,048,576      1,073,741,824
1,048,576    1           20             1,048,576    20,971,520      about 10^12    about 10^18

Does anyone really have that much data?
It's quite common. For example, it's hard to find a digital camera that has fewer than a million pixels (1
megapixel). These images are processed and displayed on the screen. The algorithms that do this had
better not be O(N^2)! If it took one microsecond (1 millionth of a second) to process each pixel, an O(N^2)
algorithm would take more than a week to finish processing a 1 megapixel image, and more than three
months to process a 3 megapixel image (note the rate of increase is definitely not linear).
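
The arithmetic behind those estimates can be sketched in a few lines. The assumptions here are mine: one megapixel is taken as exactly 1,000,000 pixels, an O(N^2) algorithm is assumed to perform exactly N^2 operations of one microsecond each, and QuadraticCostDemo and its methods are hypothetical names.

```java
class QuadraticCostDemo {
    static final double SECONDS_PER_OPERATION = 1e-6;  // one microsecond, as in the text
    static final double SECONDS_PER_DAY = 86_400.0;

    // Days needed when the number of operations is N^2.
    static double quadraticDays(long n) {
        double seconds = (double) n * n * SECONDS_PER_OPERATION;
        return seconds / SECONDS_PER_DAY;
    }

    // Seconds needed when the number of operations is N.
    static double linearSeconds(long n) {
        return n * SECONDS_PER_OPERATION;
    }

    public static void main(String[] args) {
        long oneMegapixel = 1_000_000L;
        long threeMegapixels = 3_000_000L;
        // Tripling N multiplies the quadratic cost by nine, not by three.
        System.out.printf("O(N^2), 1 MP: %.1f days%n", quadraticDays(oneMegapixel));
        System.out.printf("O(N^2), 3 MP: %.1f days%n", quadraticDays(threeMegapixels));
        System.out.printf("O(N),   3 MP: %.1f seconds%n", linearSeconds(threeMegapixels));
    }
}
```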
Another example is sound. CD audio samples are 16 bits, sampled 44,100 times per second for each of two
channels. A typical 3 minute song consists of about 8 million data points per channel (about 16 million in
total). You had better choose the right algorithm to process this data.
A dictionary I've used for text analysis has about 125,000 entries. There's a big difference between a linear
O(N), binary O(log N), or hash O(1) search.
Best, worst, and average cases
You should be clear about which cases big-oh notation describes. By default it usually refers to the average
case, using random data. However, the characteristics for best, worst, and average cases can be very
different, and the use of non-random (often more realistic) data can have a big effect on some
algorithms.
Why big-oh notation isn't always useful
Complexity analysis can be very useful, but there are problems with it too.
• Too hard to analyze. Many algorithms are simply too hard to analyze mathematically.
• Average case unknown. There may not be sufficient information to know what the most
  important "average" case really is, therefore analysis is impossible.
• Unknown constant. Both walking and traveling at the speed of light have a
  time-as-function-of-distance big-oh complexity of O(N). Although they have the same big-oh
  characteristics, one is rather faster than the other. Big-oh analysis only tells you how it
  grows with the size of the problem, not how efficient it is.
• Small data sets. If there are no large amounts of data, algorithm efficiency may not be
  important.
Benchmarks are better
Big-oh notation can give very good ideas about performance for large amounts of data, but the only real
way to know for sure is to actually try it with large data sets. There may be performance issues that are not
taken into account by big-oh notation, e.g., the effect on paging as virtual memory usage grows. Although
benchmarks are better, they aren't feasible during the design process, so Big-Oh complexity analysis is the
choice.
Typical big-oh values for common algorithms
Searching
Here is a table of typical cases.
Type of Search                               Big-Oh      Comments
Linear search array/ArrayList/LinkedList     O(N)
Binary search sorted array/ArrayList         O(log N)    Requires sorted data.
Search balanced tree                         O(log N)
Search hash table                            O(1)
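
To make the gap between O(N) and O(log N) tangible, the sketch below counts how many elements each search has to inspect. SearchStepsDemo, range, linearSearchSteps and binarySearchSteps are hypothetical names, and the input is simply the sorted integers 0 to N-1.

```java
class SearchStepsDemo {
    // Builds the sorted array 0, 1, ..., n-1.
    static int[] range(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    // Number of elements inspected by a front-to-back linear search.
    static int linearSearchSteps(int[] a, int key) {
        int steps = 0;
        for (int value : a) {
            steps++;
            if (value == key) {
                break;
            }
        }
        return steps;
    }

    // Number of elements inspected by a binary search; requires sorted data.
    static int binarySearchSteps(int[] sorted, int key) {
        int lo = 0, hi = sorted.length - 1, steps = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;  // unsigned shift avoids int overflow
            steps++;
            if (sorted[mid] == key) {
                break;
            } else if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 1 << 20;  // 1,048,576 elements
        int[] data = range(n);
        int worstCaseKey = n - 1;  // last element: worst case for the linear search
        System.out.println("linear: " + linearSearchSteps(data, worstCaseKey));
        System.out.println("binary: " + binarySearchSteps(data, worstCaseKey));
    }
}
```

The linear search inspects every one of the 1,048,576 elements, while the binary search needs only about log2(N) + 1 probes.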
Other Typical Operations
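Algorithm           array/ArrayList    LinkedList
access front        O(1)               O(1)
access back         O(1)               O(1)
access middle       O(1)               O(N)
insert at front     O(N)               O(1)
insert at back      O(1)               O(1)
insert in middle    O(N)               O(1)
Note: the O(1) for inserting in the middle of a LinkedList applies once the position has been located;
finding that position is itself O(N).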
Sorting arrays/ArrayLists
Some sorting algorithms show variability in their Big-Oh performance. It is therefore interesting to look at
their best, worst, and average performance. For this description "average" is applied to uniformly distributed
values. The distribution of real values for any given application may be important in selecting a particular
algorithm.
Type of Sort      Best          Worst         Average       Comments
BubbleSort        O(N)          O(N^2)        O(N^2)        Not a good sort, except with ideal data.
Selection sort    O(N^2)        O(N^2)        O(N^2)        Perhaps best of O(N^2) sorts.
QuickSort         O(N log N)    O(N^2)        O(N log N)    Good, but its worst case is O(N^2).
HeapSort          O(N log N)    O(N log N)    O(N log N)    Typically slower than QuickSort, but worst case is much better.
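
The best-case and worst-case entries for BubbleSort can be reproduced by counting comparisons. The sketch below is one common variant, a bubble sort that stops as soon as a pass makes no swaps; BubbleSortDemo and its helper methods are hypothetical names.

```java
class BubbleSortDemo {
    // Sorts `a` in place and returns the number of comparisons performed.
    static long bubbleSort(int[] a) {
        long comparisons = 0;
        boolean swapped = true;
        // Stop early when a full pass makes no swaps: the data is already sorted.
        for (int end = a.length - 1; end > 0 && swapped; end--) {
            swapped = false;
            for (int i = 0; i < end; i++) {
                comparisons++;
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
        return comparisons;
    }

    // 0, 1, ..., n-1: the best case.
    static int[] ascending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    // n-1, ..., 1, 0: the worst case.
    static int[] descending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = n - 1 - i;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(bubbleSort(ascending(1000)));   // 999     (N - 1)
        System.out.println(bubbleSort(descending(1000)));  // 499500  (N(N - 1)/2)
    }
}
```

Already sorted input needs a single pass of N - 1 comparisons, which is the O(N) best case; reversed input needs N(N - 1)/2 comparisons, which is the O(N^2) worst case.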
Example - choosing a non-optimal algorithm

I had to sort a large array of numbers. The values were almost always already in order, and even when they
weren't in order there was typically only one number that was out of order. Only rarely were the values
completely disorganized. I used a bubble sort because it was O(N) for my "average" data. This was many
years ago when CPUs were 1000 times slower. Today I would simply use the library sort for the amount of
data I had because the difference in execution time would probably be unnoticed. However, there are always
data sets which are so large that a choice of algorithms really matters.
Example - O(N^3) surprise
I once wrote a text-processing program to solve some particular customer problem. After seeing how well it
processed the test data, the customer produced real data, which I confidently ran the program on. The
program froze -- the problem was that I had inadvertently used an O(N^3) algorithm and there was no way it
was going to finish in my lifetime. Fortunately, my reputation was restored when I was able to rewrite the
offending algorithm within an hour and process the real data in under a minute. Still, it was a sobering
experience, illustrating dangers in ignoring complexity analysis, using unrealistic test data, and giving
customer demos.
Same Big-Oh, but big differences
Although two algorithms have the same big-oh characteristics, they may differ by a factor of three (or more) in
practical implementations. Remember that big-oh notation ignores constant overhead and constant factors.
These can be substantial and can't be ignored in practical implementations.
Time-space tradeoffs
Sometimes it's possible to reduce execution time by using more space, or reduce space requirements by
using a more time-intensive algorithm.
