Alternation and Grouping

Often, you want to check whether the input string matches any one of multiple patterns. For example, whether a car color is green or blue or red. In programming terms, you need to perform an OR conditional. This chapter will show how to use alternation for such cases. These patterns can also have some common elements between them, in which case grouping helps to form terser regexps. This chapter will also discuss the precedence rules used to determine which alternation wins.

OR conditional

In a conditional expression, you can use logical operators to combine multiple conditions. With regular expressions, the | metacharacter is similar to logical OR. The regexp will match if any of the expressions separated by | is satisfied. Each of these alternations is a full regexp. For example, an anchor applies only to the alternation it appears in.

// match either 'cat' or 'dog'
> const pets = /cat|dog/
> pets.test('I like cats')
< true
> pets.test('I like dogs')
< true
> pets.test('I like parrots')
< false

// replace either 'cat' at start of string or 'cat' at end of word
> 'catapults concatenate cat scat'.replace(/^cat|cat\b/g, 'X')
< "Xapults concatenate X sX"
// replace either 'cat' or 'dog' or 'fox' with 'mammal'
> 'cat dog bee parrot fox'.replace(/cat|dog|fox/g, 'mammal')
< "mammal mammal bee parrot mammal"

info You might infer from the above examples that there can be situations where many alternations are required. See the Dynamically building alternation section for examples and details.
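
The car color check mentioned at the start of this chapter can be sketched with alternation and the test method. The cars array below is hypothetical sample data made up for illustration. The regexp has no g flag, so reusing it across multiple test calls is safe.

```javascript
// hypothetical sample data, not taken from the examples above
const cars = ['red sedan', 'yellow van', 'blue truck', 'green bike', 'white bus']

// matches if any one of the three colors is present anywhere in the string
const wanted = /green|blue|red/

// keep only the elements for which at least one alternative matches
const picked = cars.filter(c => wanted.test(c))
console.log(picked)
// [ 'red sedan', 'blue truck', 'green bike' ]
```

Note that the alternatives match anywhere in the string, so an element like 'credit' would also be picked because it contains red.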

Grouping

Often, the regexp alternatives have something in common, such as shared characters or regexp qualifiers like anchors. In such cases, you can factor out the common part and group the rest using a pair of parentheses metacharacters. Similar to a(b+c)d = abd+acd in maths, you get a(b|c)d = abd|acd in regular expressions.

// without grouping
> 'red reform read arrest'.replace(/reform|rest/g, 'X')
< "red X read arX"
// with grouping
> 'red reform read arrest'.replace(/re(form|st)/g, 'X')
< "red X read arX"

// without grouping
> 'par spare part party'.replace(/\bpar\b|\bpart\b/g, 'X')
< "X spare X party"
// taking out common anchors
> 'par spare part party'.replace(/\b(par|part)\b/g, 'X')
< "X spare X party"
// taking out common characters as well
// you'll later learn a better technique than using an empty alternative
> 'par spare part party'.replace(/\bpar(|t)\b/g, 'X')
< "X spare X party"

info There are many more features to grouping than just forming terser regexps. They will be discussed as they become relevant in the coming chapters.
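
Since an anchor belongs only to the alternation it is written in, grouping can also change the meaning of a regexp instead of just shortening it. Here's a minimal sketch of that difference. The inputs array and the variable names are made up for illustration.

```javascript
// 'cat' at the start of the string OR 'dog' at the end of the string
const loose = /^cat|dog$/
// the whole string has to be either 'cat' or 'dog'
const strict = /^(cat|dog)$/

// hypothetical sample data
const inputs = ['cat', 'dog', 'catalog', 'hotdog', 'cat dog']

const looseHits = inputs.filter(s => loose.test(s))
const strictHits = inputs.filter(s => strict.test(s))

console.log(looseHits)
// [ 'cat', 'dog', 'catalog', 'hotdog', 'cat dog' ]
console.log(strictHits)
// [ 'cat', 'dog' ]
```

In the first regexp, ^ applies only to cat and $ applies only to dog. The parentheses in the second regexp make both anchors apply to every alternative.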

Precedence rules

There are some tricky situations when using alternation. If it is used for testing a match to get true/false against a string input, there is no ambiguity. However, for other things like string replacement, the result depends on a few factors. Say you want to replace either are or spared: which one should get precedence? The bigger word spared, the substring are inside it, or something else?

The regexp alternative which matches earliest in the input string gets precedence.

> let words = 'lion elephant are rope not'

// starting index of 'on' < index of 'ant' for given string input
// so 'on' will be replaced irrespective of order of alternations
> words.replace(/on|ant/, 'X')
< "liX elephant are rope not"
> words.replace(/ant|on/, 'X')
< "liX elephant are rope not"
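
To see why the order of alternations made no difference above, you can inspect where the match starts. This sketch assumes the match string method, which (without the g flag) returns the matched portion along with an index property.

```javascript
const words = 'lion elephant are rope not'

// both orderings report the same matched text and starting index
const m1 = words.match(/on|ant/)
const m2 = words.match(/ant|on/)
console.log(m1[0], m1.index)    // on 2
console.log(m2[0], m2.index)    // on 2

// 'ant' first occurs much later in the input, so it loses to 'on'
const antIndex = words.indexOf('ant')
console.log(antIndex)           // 10
```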

So, what happens if two or more alternatives match at the same index? The precedence is then left to right in the order of declaration.

> let mood = 'best years'

// starting index for 'year' and 'years' will always be same
// so, which one gets replaced depends on the order of alternations
> mood.replace(/year|years/, 'X')
< "best Xs"
> mood.replace(/years|year/, 'X')
< "best X"
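
One nuance worth sketching: the left to right order picks the first alternative that lets the rest of the regexp succeed. If something after the group fails, the next alternative is tried at the same index. That is why \b(par|part)\b in the grouping section could replace part even though par is listed first. The variable names below are made up for illustration.

```javascript
const mood = 'best years'

// nothing follows the alternation, so the first alternative wins
const plain = mood.match(/year|years/)[0]
console.log(plain)      // year

// 'year' is tried first, but \b fails between 'r' and 's'
// so the next alternative 'years' is tried at the same index
const bounded = mood.match(/\b(year|years)\b/)[0]
console.log(bounded)    // years
```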

Another example with replace to drive home the issue.

> let sample = 'ear xerox at mare part learn eye'

// this is going to be same as: replace(/ar/g, 'X')
> sample.replace(/ar|are|art/g, 'X')
< "eX xerox at mXe pXt leXn eye"
// this is going to be same as: replace(/are|ar/g, 'X')
> sample.replace(/are|ar|art/g, 'X')
< "eX xerox at mX pXt leXn eye"
// phew, finally this one works as expected
> sample.replace(/are|art|ar/g, 'X')
< "eX xerox at mX pX leXn eye"
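
One possible way to avoid hand-ordering the alternatives is to sort them so that longer strings come first. The sketch below is only an illustration under a few assumptions: longestFirst is a hypothetical helper (not a built-in), and the strings are assumed to contain no regexp metacharacters.

```javascript
// hypothetical helper: returns a copy sorted by descending length
// the sort is stable, so strings of equal length keep their original order
function longestFirst(words) {
  return [...words].sort((a, b) => b.length - a.length)
}

const sample = 'ear xerox at mare part learn eye'

// assumes the strings have no metacharacters that need escaping
const alts = longestFirst(['ar', 'are', 'art'])
const pat = new RegExp(alts.join('|'), 'g')
console.log(pat.source)
// are|art|ar

const replaced = sample.replace(pat, 'X')
console.log(replaced)
// eX xerox at mX pX leXn eye
```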

Cheatsheet and Summary

Note                     Description
pat1|pat2|pat3           multiple regexp combined as OR conditional
                         each alternative can have independent anchors
()                       group pattern(s)
a(b|c)d                  same as abd|acd
Alternation precedence   pattern which matches earliest in the input gets precedence
                         tie-breaker is left to right if matches have same starting location

So, this chapter was about specifying one or more alternate matches within the same regexp using the | metacharacter, which can be simplified further using () grouping if the alternatives have common aspects. Among the alternations, the earliest matching pattern gets precedence. Left to right ordering is used as a tie-breaker if multiple alternations match starting from the same location. In the next chapter, you'll learn how to construct alternation from an array of strings taking care of precedence rules. Grouping has various other uses too, which will be discussed in coming chapters.

Exercises

a) For the given input array, filter all elements that start with den or end with ly

> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\n', 'dent']

> items.filter()        // add your solution here
< ["lovely", "2 lonely", "dent"]

b) For the given array, filter all elements having a line starting with den or ending with ly

> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\nfar', 'dent']

> items.filter()        // add your solution here
< ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]

c) For the given input strings, replace all occurrences of removed or reed or received or refused with X.

> let s1 = 'creed refuse removed read'
> let s2 = 'refused reed redo received'

> const pat1 =      // add your solution here

> s1.replace(pat1, 'X')
< "cX refuse X read"
> s2.replace(pat1, 'X')
< "X X redo X"

d) For the given input strings, replace late or later or slated with A.

> let str1 = 'plate full of slate'
> let str2 = "slated for later, don't be late"

> const pat2 =      // add your solution here

> str1.replace(pat2, 'A')
< "pA full of sA"
> str2.replace(pat2, 'A')
< "A for A, don't be A"