Alternation and Grouping
Often, you want to check whether the input string matches any of several patterns. For example, whether a product's color is green, blue or red. This chapter shows how to use alternation for such cases. These patterns can have some common elements between them, in which case grouping helps to form terser regexps. This chapter also discusses the precedence rules used to determine which alternative wins.
Alternation
A conditional expression combined with logical OR evaluates to true if any of the conditions is satisfied. Similarly, in regular expressions, you can use the | metacharacter to combine multiple patterns to indicate logical OR. The matching will succeed if any of the alternate patterns is found in the input string. These alternatives have the full power of a regular expression, for example they can have their own independent anchors. Here are some examples.
// match either 'cat' or 'dog'
> const pets = /cat|dog/
> pets.test('I like cats')
< true
> pets.test('I like dogs')
< true
> pets.test('I like parrots')
< false
// replace either 'cat' at the start of string or 'cat' at the end of word
> 'catapults concatenate cat scat cater'.replace(/^cat|cat\b/g, 'X')
< 'Xapults concatenate X sX cater'
// replace 'cat' or 'dog' or 'fox' with 'mammal'
> 'cat dog bee parrot fox'.replace(/cat|dog|fox/g, 'mammal')
< 'mammal mammal bee parrot mammal'
You might infer from the above examples that there can be situations where many alternations are required. See the Dynamically building alternation section for examples and details.
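To connect alternation back to logical OR, here is a minimal sketch that checks the color case mentioned at the start of this chapter. The `products` array and the variable names are made up for illustration. The sketch assumes that a plain substring match is good enough, so a word like `bored` would also count as containing `red`.

```javascript
// hypothetical product descriptions, invented for this illustration
const products = ['green mug', 'blue pen', 'yellow lamp', 'dark red scarf'];

// one regexp with three alternatives
const color = /green|blue|red/;
const viaAlternation = products.filter(p => color.test(p));

// the same check written as separate tests joined with logical OR
const viaLogicalOr = products.filter(
    p => /green/.test(p) || /blue/.test(p) || /red/.test(p)
);

console.log(viaAlternation);
// [ 'green mug', 'blue pen', 'dark red scarf' ]
console.log(viaLogicalOr);
// [ 'green mug', 'blue pen', 'dark red scarf' ]
```

Both approaches select the same elements here. The alternation version keeps the whole condition inside a single regexp.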
Grouping
Often, there are some common portions among the regexp alternatives. These could be common characters, qualifiers like the anchors and so on. In such cases, you can group them using a pair of parentheses metacharacters. Similar to a(b+c)d = abd+acd in maths, you get a(b|c)d = abd|acd in regular expressions.
// without grouping
> 'red reform read arrest'.replace(/reform|rest/g, 'X')
< 'red X read arX'
// with grouping
> 'red reform read arrest'.replace(/re(form|st)/g, 'X')
< 'red X read arX'
// without grouping
> 'par spare part party'.replace(/\bpar\b|\bpart\b/g, 'X')
< 'X spare X party'
// taking out common anchors
> 'par spare part party'.replace(/\b(par|part)\b/g, 'X')
< 'X spare X party'
// taking out common characters as well
// you'll later learn a better technique instead of using empty alternates
> 'par spare part party'.replace(/\bpar(|t)\b/g, 'X')
< 'X spare X party'
There are many more uses for grouping than just forming a terser regexp. They will be discussed as they become relevant in the coming chapters.
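Grouping also controls how far an alternation reaches. The sketch below reuses the `par` and `part` input from the examples above to compare a regexp with and without parentheses. The variable names are only for illustration.

```javascript
const text = 'par spare part party';

// without parentheses, | splits the entire regexp into two alternatives:
// '\bpar' (anchored only at the start) and 'part\b' (anchored only at the end)
const ungrouped = text.replace(/\bpar|part\b/g, 'X');

// with parentheses, both anchors apply to whichever alternative is tried
const grouped = text.replace(/\b(par|part)\b/g, 'X');

console.log(ungrouped);
// X spare Xt Xty
console.log(grouped);
// X spare X party
```

In the ungrouped version, `\bpar` matches at the start of `part` and `party` because nothing requires a word boundary after it.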
Precedence rules
There are tricky situations when using alternation. There is no ambiguity if it is used to get a boolean result by testing a match against a string input. However, for cases like string replacement, the result depends on a few factors. Say, you want to replace either are or spared: which one should get precedence? The bigger word spared, the substring are inside it, or something else?
The regexp alternative which matches earliest in the input string gets precedence.
> let words = 'lion elephant are rope not'
// starting index of 'on' < index of 'ant' for the given string input
// so 'on' will be replaced irrespective of the order of alternations
> words.replace(/on|ant/, 'X')
< 'liX elephant are rope not'
> words.replace(/ant|on/, 'X')
< 'liX elephant are rope not'
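If you'd like to verify the starting positions yourself, here is a small sketch using the `index` property of the array returned by the `match` method (for a regexp without the `g` flag). The variable names other than `words` are made up for this illustration.

```javascript
const words = 'lion elephant are rope not';

// where each alternative would first match on its own
const onIndex = words.indexOf('on');
const antIndex = words.indexOf('ant');
console.log(onIndex, antIndex);
// 2 10

// without the g flag, match() returns the first match along with its index
const m1 = words.match(/on|ant/);
const m2 = words.match(/ant|on/);
console.log(m1[0], m1.index);
// on 2
console.log(m2[0], m2.index);
// on 2
```

Both orderings report the same match, since `on` occurs earlier in the input than `ant`.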
What happens if the alternatives have the same starting index? The precedence is left-to-right in the order of declaration.
> let mood = 'best years'
// starting index for 'year' and 'years' will always be the same
// so, which one gets replaced depends on the order of alternations
> mood.replace(/year|years/, 'X')
< 'best Xs'
> mood.replace(/years|year/, 'X')
< 'best X'
Here's another example to drive home the issue.
> let sample = 'ear xerox at mare part learn eye'
// this is going to be same as: replace(/ar/g, 'X')
> sample.replace(/ar|are|art/g, 'X')
< 'eX xerox at mXe pXt leXn eye'
// this is going to be same as: replace(/are|ar/g, 'X')
> sample.replace(/are|ar|art/g, 'X')
< 'eX xerox at mX pXt leXn eye'
// phew, finally this one works as needed
> sample.replace(/are|art|ar/g, 'X')
< 'eX xerox at mX pX leXn eye'
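To see exactly which portions were matched in each case, you can list them with the `match` method and the `g` flag instead of replacing them. This is just a sketch for inspecting the three orderings shown above. The variable names other than `sample` are made up for this illustration.

```javascript
const sample = 'ear xerox at mare part learn eye';

// 'ar' is listed first, so 'are' and 'art' never get a chance
const shortestFirst = sample.match(/ar|are|art/g);
// 'are' wins over 'ar', but 'art' is still listed after 'ar'
const mixedOrder = sample.match(/are|ar|art/g);
// longer alternatives are listed before the shorter one
const longestFirst = sample.match(/are|art|ar/g);

console.log(shortestFirst);
// [ 'ar', 'ar', 'ar', 'ar' ]
console.log(mixedOrder);
// [ 'ar', 'are', 'ar', 'ar' ]
console.log(longestFirst);
// [ 'ar', 'are', 'art', 'ar' ]
```

Only the last ordering lets every alternative match, because an alternative that is a prefix of another one is listed after it.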
Cheatsheet and Summary
Note | Description |
---|---|
pat1\|pat2\|pat3 | multiple regexps combined as conditional OR; each alternative can have independent anchors |
() | group pattern(s) |
a(b\|c)d | same as abd\|acd |
Alternation precedence | pattern which matches earliest in the input gets precedence; tie-breaker is left to right if matches have the same starting location |
So, this chapter was about specifying one or more alternate matches within the same regexp using the | metacharacter, which can further be simplified using () grouping if the alternations have common portions. Among the alternations, the earliest matching pattern gets precedence. Left-to-right ordering is used as a tie-breaker if multiple alternations have the same starting location. In the next chapter, you'll learn how to construct an alternation pattern from an array of strings taking care of precedence rules. Grouping has various other uses too, which will be discussed in the coming chapters.
Exercises
1) For the given input array, filter all elements that start with den or end with ly.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\n', 'dent']
> items.filter() // add your solution here
< ['lovely', '2 lonely', 'dent']
2) For the given array, filter all elements having a line starting with den or ending with ly.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\nfar', 'dent']
> items.filter() // add your solution here
< ['lovely', '1\ndentist', '2 lonely', 'fly\nfar', 'dent']
3) For the given input strings, replace all occurrences of removed or reed or received or refused with X.
> let s1 = 'creed refuse removed read'
> let s2 = 'refused reed redo received'
> const pat1 = // add your solution here
> s1.replace(pat1, 'X')
< 'cX refuse X read'
> s2.replace(pat1, 'X')
< 'X X redo X'
4) For the given input strings, replace late or later or slated with A.
> let str1 = 'plate full of slate'
> let str2 = "slated for later, don't be late"
> const pat2 = // add your solution here
> str1.replace(pat2, 'A')
< 'pA full of sA'
> str2.replace(pat2, 'A')
< "A for A, don't be A"