Vol. I ยท A Field Guide For the working programmer Edition MMXXVI

The Regex
Cheatยทsheet

Patterns, Anchors & Escapes โ€” at a glance

Verified
/.*?/ Pattern ยท Match
ยง Prefatory
Note

A regular expression is a tiny language for describing the shape of text. This sheet collects the pieces you will reach for most often โ€” what to match, how much of it, and where. Examples assume the common /pattern/flags notation; small dialect differences exist between engines.

Contents

ยง 01 ยท Chapter I

Character Classes

What kind of character to match. The vocabulary of single positions.

.
Any character
Matches any single character except a newline (unless the s flag is set).
exc.t โ†’ cat, cot, c7t
\d
Digit
A single digit, equivalent to [0-9].
ex\d\d โ†’ age 42
\D
Non-digit
Any character that is not a digit.
ex\D+ โ†’ price: $42
\w
Word character
A letter, digit, or underscore โ€” equivalent to [A-Za-z0-9_].
ex\w+ โ†’ hello_world
\W
Non-word character
Anything other than a letter, digit, or underscore.
ex\W โ†’ hi!
\s
Whitespace
A space, tab, newline, or other whitespace character.
exa\sb โ†’ a b
\S
Non-whitespace
Any character that isn't whitespace.
ex\S+ โ†’ hello world
[abc]
Character set
Any one of the listed characters.
ex[aeiou] โ†’ cat
[^abc]
Negated set
Any character not in the list. The ^ inverts the class.
ex[^0-9] โ†’ abc!
[a-z]
Character range
A range of characters by code point. Combine ranges freely.
ex[A-Za-z0-9] โ†’ Hello42
ยง 02 ยท Chapter II

Quantifiers

How many times the preceding piece should occur. Greedy by default; add ? to make them lazy.

*
Zero or more
Matches the previous item any number of times, including none.
exab* โ†’ a, ab, abbbb
+
One or more
Matches the previous item at least once.
ex\d+ โ†’ page 42
?
Optional
Zero or one โ€” makes the preceding piece optional.
excolou?r โ†’ color, colour
{n}
Exactly n
Exactly n occurrences of the preceding item.
ex\d{4} โ†’ year 2026
{n,}
At least n
n or more occurrences.
ex\d{3,} โ†’ 12345
{n,m}
Between n and m
A range of repetitions, inclusive on both ends.
ex\d{2,4} โ†’ 42, 2026
*? +?
Lazy (non-greedy)
Adding ? to a quantifier makes it match as little as possible.
ex<.*?> โ†’ <b>hi</b>
*+ ++
Possessive
Matches greedily and never backtracks. Not supported in all engines (e.g. JavaScript).
ex\d++ โ†’ matches digits, won't release
ยง 03 ยท Chapter III

Anchors & Boundaries

Anchors match positions, not characters. They specify where in the string a match must occur.

^
Start of string
Matches the beginning of the input. With the m flag, matches the start of each line.
ex^Hello โ†’ Hello world
$
End of string
Matches the end of the input (or end of each line with m).
exend$ โ†’ the end
\b
Word boundary
Position between a word character and a non-word character. No characters consumed.
ex\bcat\b โ†’ cat, not category
\B
Not a word boundary
The opposite of \b โ€” inside a word, or between two non-word characters.
ex\Bcat โ†’ concatenate
\A
Absolute start
Start of input only โ€” never affected by the m flag. Not in JavaScript.
ex\Aabc โ†’ abcdef
\z
Absolute end
End of input only. Many engines also accept \Z (end, but allows trailing newline).
exend\z โ†’ the end
ยง 04 ยท Chapter IV

Groups & Alternation

Group pieces, choose between alternatives, and refer back to captured text.

(abc)
Capturing group
Groups expressions and saves the matched text for later reference.
ex(\d{3})-\d{4} โ†’ 555-1234
(?:abc)
Non-capturing group
Groups without saving โ€” useful when you only need structure, not the capture.
ex(?:ab)+ โ†’ ababab
a|b
Alternation
Match either the left or right alternative. Lowest precedence โ€” usually wrapped in a group.
excat|dog โ†’ cat, dog
(?<n>..)
Named group
Capture under a name, accessed via match.groups.n or similar.
ex(?<year>\d{4}) โ†’ 2026
\1 \2
Backreference
Reuses the text matched by an earlier capturing group, by number.
ex(\w+) \1 โ†’ the the
\k<n>
Named backreference
Reuses a named capture by name rather than position.
ex(?<c>\w)\k<c> โ†’ ll in hello
ยง 05 ยท Chapter V

Escape Sequences

The backslash has two jobs: strip special meaning from metacharacters, and grant it to ordinary letters.

\.
Literal period
A real . rather than the "any character" metacharacter.
ex\d+\.\d+ โ†’ 3.14
\\
Literal backslash
A single backslash character in the input.
exC:\\\\ โ†’ C:\
\( \)
Literal parens
Match real parentheses instead of starting a group.
ex\(\d+\) โ†’ (42)
\[ \]
Literal brackets
Match real square brackets.
ex\[\w+\] โ†’ [note]
\? \* \+
Literal quantifiers
Escape these to match them as ordinary punctuation.
exare\? โ†’ are?
\|
Literal pipe
A vertical bar character, not alternation.
exa\|b โ†’ a|b
\n
Newline
Line-feed character (ASCII 10).
exline\nline โ†’ multi-line break
\t
Tab
Horizontal tab character (ASCII 9).
exa\tb โ†’ aโ†’b
\r
Carriage return
ASCII 13. Often paired with \n on Windows line endings.
ex\r\n โ†’ Windows EOL
\xFF
Hex code
A character given by a two-digit hexadecimal value.
ex\x41 โ†’ A
\uFFFF
Unicode code point
A Unicode character given by a four-digit hex code.
ex\u00e9 โ†’ รฉ
\0
Null character
The null byte (ASCII 0).
exrare outside binary contexts
ยง 06 ยท Chapter VI

Flags

Modifiers that change how the entire pattern is interpreted. Written after the closing slash.

i
Case insensitive
Treats upper and lower case as equivalent.
ex/hello/i โ†’ HELLO, Hello
g
Global
Find every match, not just the first.
ex/a/g โ†’ banana
m
Multiline
Makes ^ and $ match start and end of each line.
ex/^foo/m โ†’ line-by-line
s
Dotall
Lets . match newlines too.
ex/a.b/s โ†’ a\nb
u
Unicode
Enables proper Unicode handling and \p{...} properties.
ex/\p{Emoji}/u โ†’ ๐Ÿ™‚
x
Extended (free-spacing)
Ignores whitespace in the pattern, allowing comments. Not in JavaScript.
ex/\d+ # digits/x
ยง 07 ยท Chapter VII

Lookarounds

Zero-width assertions. They check what's beside the current position without consuming any characters.

(?=..)
Positive lookahead
The following text must match the inner pattern, but isn't part of the match.
ex\d+(?=px) โ†’ 42px
(?!..)
Negative lookahead
The following text must not match.
ex\d+(?!px) โ†’ 42em
(?<=..)
Positive lookbehind
The preceding text must match the inner pattern.
ex(?<=\$)\d+ โ†’ $42
(?<!..)
Negative lookbehind
The preceding text must not match.
ex(?<!\$)\d+ โ†’ qty 42
ยง 08 ยท Chapter VIII

Common Recipes

Useful starting points. Tweak to taste โ€” every dataset has its quirks.

โ„– I Email address (simplified)
/\b[\w.+-]+@[\w-]+\.[\w.-]+\b/i
โ„– II URL
/https?:\/\/[\w.-]+(?:\/[\w./?&=#%-]*)?/gi
โ„– III Hex color code
/^#[0-9a-f]{3}([0-9a-f]{3})?$/i
โ„– IV ISO date ยท YYYY-MM-DD
/\b\d{4}-\d{2}-\d{2}\b/
โ„– V 24-hour time
/^(?:[01]\d|2[0-3]):[0-5]\d$/
โ„– VI Phone ยท (555) 123-4567
/\(\d{3}\) \d{3}-\d{4}/
โ„– VII IPv4 address (loose)
/\b\d{1,3}(?:\.\d{1,3}){3}\b/
โ„– VIII Duplicate words
/\b(\w+)\s+\1\b/gi
โ„– IX Trim whitespace
/^\s+|\s+$/g
โ„– X Markdown link
/\[([^\]]+)\]\(([^)]+)\)/g
ยง 09 ยท Chapter IX

The Laboratory

Try a pattern against text. Matches are highlighted in real time.

Pattern
/ /
Subject
Results
โ€” matches