Lookarounds

You've already seen how to create custom character classes and various avatars of special groupings. In this chapter you'll learn more groupings, known as lookarounds, that help to create custom anchors and add conditions within a regexp definition. These assertions are also known as zero-width patterns because they add restrictions similar to anchors and are not part of the matched portions. You will also learn how to negate a grouping, similar to negated character sets, and what's special about the \G anchor.

Conditional expressions

Before you get used to lookarounds too much, it is good to remember that Ruby is a programming language. You have control structures and you can combine multiple conditions using logical operators, methods like all?, any?, etc. Also, do not forget that regular expressions are only one of the tools available for string processing.

>> items = ['1,2,3,4', 'a,b,c,d', '#apple 123']

# filter elements containing digit and '#' characters
>> items.filter { _1.match?(/\d/) && _1.include?('#') }
=> ["#apple 123"]

# modify elements only if it doesn't start with '#'
>> items.filter_map { |s| s.sub(/,.+,/, ' ') if s[0] != '#' }
=> ["1 4", "a d"]
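
The all? and any? methods mentioned above can combine several regexp checks without writing one large pattern. Here's a minimal sketch of that idea. The method name matches_all? is made up for illustration and is not a built-in method.

```ruby
# hypothetical helper: true only if every pattern in 'rules' matches 'str'
def matches_all?(str, rules)
  rules.all? { |re| str.match?(re) }
end

items = ['1,2,3,4', 'a,b,c,d', '#apple 123']

# elements containing a digit AND a '#' character
p items.select { |s| matches_all?(s, [/\d/, /#/]) }
# => ["#apple 123"]

# elements containing a digit OR a '#' character
p items.select { |s| [/\d/, /#/].any? { |re| s.match?(re) } }
# => ["1,2,3,4", "#apple 123"]
```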

Negative lookarounds

Lookaround assertions can be added in two ways: lookbehind and lookahead. Each of these can be a positive or a negative assertion. Syntax-wise, lookbehind has an extra < compared to the lookahead version. Negative lookarounds can be identified by the use of ! whereas = is used for positive lookarounds. This section is about negative lookarounds, whose syntax is shown below:

  • (?!pat) for negative lookahead assertion
  • (?<!pat) for negative lookbehind assertion

As mentioned earlier, lookarounds are not part of the matched portions and do not capture the matched text.

# change 'cat' only if it is not followed by a digit character
# note that the end of string satisfies this assertion
# 'catcat' has two matches as the assertion doesn't consume characters
>> 'hey cats! cat42 cat_5 catcat'.gsub(/cat(?!\d)/, 'dog')
=> "hey dogs! cat42 dog_5 dogdog"

# change 'cat' only if it is not preceded by _
# note how 'cat' at the start of string is matched as well
>> 'cat _cat 42catcat'.gsub(/(?<!_)cat/, 'dog')
=> "dog _cat 42dogdog"

# overlap example
# the final _ was replaced as well as played a part in the assertion
>> 'cats _cater 42cat_cats'.gsub(/(?<!_)cat./, 'dog')
=> "dog _cater 42dogcats"

Lookarounds can be mixed with already existing anchors and other features to define truly powerful restrictions.

# change whole word only if it is not preceded by : or --
>> ':cart apple --rest ;tea'.gsub(/(?<!:|--)\b\w+/, 'X')
=> ":cart X --rest ;X"

# extract whole words not surrounded by punctuation marks
>> 'tie. ink east;'.scan(/(?<![[:punct:]])\b\w+\b(?![[:punct:]])/)
=> ["ink"]

# add space to word boundaries, but not at the start or end of string
# similar to: gsub(/\b/, ' ').strip
>> 'output=num1+35*42/num2'.gsub(/(?<!\A)\b(?!\z)/, ' ')
=> "output = num1 + 35 * 42 / num2"

In all the examples so far, lookahead grouping was placed as a suffix and lookbehind as a prefix. This is how they are used most of the time, but not the only way to use them. Lookarounds can be placed anywhere and multiple lookarounds can be combined in any order. They do not consume characters nor do they play a role in matched portions. They just let you know whether the condition you want to test is satisfied from the current location in the input string.

# these two are equivalent
# replace a character as long as it is not preceded by 'p' or 'r'
>> 'spare'.gsub(/(?<![pr])./, '*')
=> "**a*e"
>> 'spare'.gsub(/.(?<![pr].)/, '*')
=> "**a*e"

# replace 'par' as long as 's' is not present later in the input
# this assumes that the lookaround doesn't conflict with search pattern
# i.e. 's' will not conflict 'par' but would affect if it was 'r' and 'par'
>> 'par spare part party'.gsub(/par(?!.*s)/, '[\0]')
=> "par s[par]e [par]t [par]ty"
>> 'par spare part party'.gsub(/(?!.*s)par/, '[\0]')
=> "par s[par]e [par]t [par]ty"

# since the three assertions used here are all zero-width,
# all of the 6 possible combinations will be equivalent
>> 'output=num1+35*42/num2'.gsub(/(?!\z)\b(?<!\A)/, ' ')
=> "output = num1 + 35 * 42 / num2"
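
To see that an assertion checks a location without consuming characters, you can inspect the text that follows the matched portion. The sketch below uses MatchData#post_match for that purpose. The helper name match_and_rest is an assumption made for this illustration.

```ruby
# hypothetical helper: returns the matched text and the text that follows it
def match_and_rest(str, re)
  m = str.match(re)
  m && [m[0], m.post_match]
end

# '_' satisfies the negative lookahead, but it isn't part of the match
p match_and_rest('cat_5', /cat(?!\d)/)
# => ["cat", "_5"]

# without the lookaround, the non-digit character is consumed
p match_and_rest('cat_5', /cat\D/)
# => ["cat_", "5"]
```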

Positive lookarounds

Unlike negative lookarounds, absence of something will not satisfy positive lookarounds. Instead, for the condition to be satisfied, the pattern has to match actual characters and/or zero-width assertions. Positive lookarounds can be identified by the use of = in the grouping. Syntax is shown below:

  • (?=pat) for positive lookahead assertion
  • (?<=pat) for positive lookbehind assertion

# extract digits only if it is followed by ,
# end of string doesn't qualify as it is impossible to have a comma afterwards
>> '42 apple-5, fig3; x-83, y-20: f12'.scan(/\d+(?=,)/)
=> ["5", "83"]

# extract digits only if it is preceded by - and followed by ; or :
>> '42 apple-5, fig3; x-83, y-20: f12'.scan(/(?<=-)\d+(?=[;:])/)
=> ["20"]

# replace 'par' as long as 'part' occurs as a whole word later in the line
>> 'par spare part party'.gsub(/par(?=.*\bpart\b)/, '[\0]')
=> "[par] s[par]e part party"
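
As another illustration of combining a positive lookbehind with a positive lookahead, here's a sketch that inserts thousands separators. The method name add_commas is hypothetical, and the input is assumed to be a string made up of ASCII digits only.

```ruby
# hypothetical helper, assumes 'digits' has only ASCII digit characters
def add_commas(digits)
  # match the empty location that has a digit before it and
  # one or more complete groups of three digits up to the end of string
  digits.gsub(/(?<=\d)(?=(?:\d{3})+\z)/, ',')
end

p add_commas('1234567')   # => "1,234,567"
p add_commas('12345')     # => "12,345"
p add_commas('123')       # => "123"
```

Both assertions are zero-width, so only the empty string between two digits is replaced and none of the digits are lost.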

Lookarounds can be quite handy for simple field based processing.

# except first and last fields
>> '1,two,3,four,5'.scan(/(?<=,)[^,]+(?=,)/)
=> ["two", "3", "four"]

# replace empty fields with nil
# note that in this case, the order of lookbehind and lookahead doesn't matter
# can also use: gsub(/(?<![^,])(?![^,])/, 'nil')
>> ',1,,,two,3,,'.gsub(/(?<=\A|,)(?=,|\z)/, 'nil')
=> "nil,1,nil,nil,two,3,nil,nil"

# surround all fields (which can be empty too) with {}
# there is an extra empty string match at the end of non-empty columns
>> ',cat,tiger'.gsub(/[^,]*/, '{\0}')
=> "{},{cat}{},{tiger}{}"
# lookarounds to the rescue
>> ',cat,tiger'.gsub(/(?<=\A|,)[^,]*/, '{\0}')
=> "{},{cat},{tiger}"

Capture groups inside positive lookarounds

Even though lookarounds are not part of the matched portions, capture groups can be used inside positive lookarounds. Can you reason out why it won't work for negative lookarounds?

# note also the use of double quoted string in the replacement section
>> puts 'a b c d e'.gsub(/(\S+\s+)(?=(\S+)\s)/, "\\1\\2\n")
a b
b c
c d
d e

# and of course, use non-capturing groups where needed
>> 'pore42 car3 pare7 care5'.scan(/(?<=(po|ca)re)\d+/)
=> [["po"], ["ca"]]
>> 'pore42 car3 pare7 care5'.scan(/(?<=(?:po|ca)re)\d+/)
=> ["42", "5"]
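
Named capture groups work inside positive lookarounds as well. In the sketch below, the value is captured by the lookahead even though only the key is part of the matched portion. The helper name key_with_value and the sample input are assumptions for illustration.

```ruby
# hypothetical helper: the key is the matched portion, while the value
# is captured inside the lookahead without being consumed
def key_with_value(str)
  m = str.match(/\w+(?==(?<val>\w+))/)
  m && [m[0], m[:val]]
end

p key_with_value('size=42')
# => ["size", "42"]

# no '=' after the word, so the assertion fails
p key_with_value('size')
# => nil
```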

Conditional AND with lookarounds

As promised earlier, here are some examples that show how lookarounds make it simpler to construct AND conditionals.

>> words = %w[sequoia subtle questionable exhibit equation]

# words containing 'b' and 'e' and 't' in any order
# same as: /b.*e.*t|b.*t.*e|e.*b.*t|e.*t.*b|t.*b.*e|t.*e.*b/
>> words.grep(/(?=.*b)(?=.*e).*t/)
=> ["subtle", "questionable", "exhibit"]

# words containing all lowercase vowels in any order
>> words.grep(/(?=.*a)(?=.*e)(?=.*i)(?=.*o).*u/)
=> ["sequoia", "questionable", "equation"]

# words containing ('ab' or 'at') and 'q' but not 'n' at the end of the element
>> words.grep(/(?=.*a[bt])(?=.*q)(?!.*n\z)/)
=> ["questionable"]
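
If the list of required strings is known only at run time, the chain of lookaheads can be built programmatically. This is a minimal sketch of that approach. The method name all_of is made up, and Regexp.escape is used so that the given strings are treated literally.

```ruby
# hypothetical helper: build a regexp that needs every string in 'parts'
# to be present somewhere in the input, in any order
def all_of(parts)
  Regexp.new(parts.map { |s| "(?=.*#{Regexp.escape(s)})" }.join)
end

words = %w[sequoia subtle questionable exhibit equation]

p all_of(%w[b e t])
# => /(?=.*b)(?=.*e)(?=.*t)/
p words.grep(all_of(%w[b e t]))
# => ["subtle", "questionable", "exhibit"]
p words.grep(all_of(%w[a e i o u]))
# => ["sequoia", "questionable", "equation"]
```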

Set start of matching portion with \K

Some of the positive lookbehind cases can be solved by adding \K as a suffix to the pattern to be asserted. The text consumed until \K won't be part of the matching portion. In other words, \K determines the starting point. The pattern before \K can be variable length too.

# similar to: /(?<=\b\w)\w*\W*/
# text matched before \K won't be replaced
>> 'sea eat car rat eel tea'.gsub(/\b\w\K\w*\W*/, '')
=> "secret"

# variable length example
>> s = 'cat scatter cater scat concatenate catastrophic catapult duplicate'
# replace only the third occurrence of 'cat'
>> s.sub(/(cat.*?){2}\Kcat/, '[\0]')
=> "cat scatter [cat]er scat concatenate catastrophic catapult duplicate"
# replace every third occurrence
>> s.gsub(/(cat.*?){2}\Kcat/, '[\0]')
=> "cat scatter [cat]er scat concatenate [cat]astrophic catapult duplicate"
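
For comparison, the same substitution can be written with a capture group that puts the skipped text back in the replacement section. The sketch below checks that both approaches agree for this input. The method names are hypothetical.

```ruby
# hypothetical helpers that mark the third occurrence of 'cat'
def mark_third_with_k(str)
  str.sub(/(cat.*?){2}\Kcat/, '[\0]')
end

def mark_third_with_group(str)
  # capture the text up to the third 'cat' and put it back
  str.sub(/((?:cat.*?){2})cat/, '\1[cat]')
end

s = 'cat scatter cater scat concatenate catastrophic catapult duplicate'

p mark_third_with_k(s) == mark_third_with_group(s)
# => true
```

The \K version avoids repeating the skipped text in the replacement section, which tends to be easier to read when the prefix pattern is long.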

Here's another example that won't work if a greedy quantifier is used instead of a possessive quantifier.

>> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,6,6,4'

# similar to: row.split(',').uniq.join(',')
# possessive quantifier used to ensure partial column is not captured
# if a column has the same text as another column, the latter column is deleted
>> nil while row.gsub!(/(?<=\A|,)([^,]++).*\K,\1(?=,|\z)/, '')
=> nil
>> row
=> "421,cat,2425,42,5,6,61,4"
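
To see why the possessive quantifier matters, here's a small sketch with an input where one field is a prefix of another. The helper name dedupe, the constant names and the sample row are assumptions made for this illustration.

```ruby
# hypothetical helper: repeatedly delete a later field that matches 'pat'
def dedupe(row, pat)
  row = row.dup
  nil while row.gsub!(pat, '')
  row
end

POSSESSIVE = /(?<=\A|,)([^,]++).*\K,\1(?=,|\z)/
GREEDY     = /(?<=\A|,)([^,]+).*\K,\1(?=,|\z)/

# '42' is not a duplicate of '421', so nothing should be deleted
p dedupe('421,42', POSSESSIVE)
# => "421,42"

# the greedy version backtracks to capture '42' from '421' and deletes ',42'
p dedupe('421,42', GREEDY)
# => "421"
```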

Warning: Don't use \K with gsub or scan if the string to match after \K can be empty. This is due to how the regexp engine has been implemented; other libraries like PCRE don't have this limitation. See stackoverflow: \K in ruby for some more details on this topic.

# [^,]*+ can match empty field, so use lookaround instead of \K
>> ',cat,tiger'.gsub(/(?<=\A|,)[^,]*+/, '{\0}')
=> "{},{cat},{tiger}"
>> ',cat,tiger'.gsub(/(?:\A|,)\K[^,]*+/, '{\0}')
=> "{},cat,{tiger}"

# another example with nothing to be matched after \K
>> 'abcd 123456'.gsub(/(?<=\w)/, ':')
=> "a:b:c:d: 1:2:3:4:5:6:"
>> 'abcd 123456'.gsub(/\w/, '\0:')
=> "a:b:c:d: 1:2:3:4:5:6:"
>> 'abcd 123456'.gsub(/\w\K/, ':')
=> "a:bc:d 1:23:45:6"

Variable length lookbehind

The pattern used for lookbehind assertions (either positive or negative) cannot imply matching variable length of text. Using a fixed length quantifier or alternations of different lengths (but each alternation being fixed length) is allowed. For some reason, alternations of different lengths inside a group are not allowed. Here are some examples to clarify these points:

>> s = 'pore42 tar3 dare7 care5'

# allowed
>> s.scan(/(?<=(?:po|da)re)\d+/)
=> ["42", "7"]
>> s.scan(/(?<=\b[a-z]{4})\d+/)
=> ["42", "7", "5"]
>> s.scan(/(?<!tar|dare)\d+/)
=> ["42", "5"]

# not allowed
>> s.scan(/(?<=(?:o|ca)re)\d+/)
invalid pattern in look-behind: /(?<=(?:o|ca)re)\d+/ (SyntaxError)
>> s.scan(/(?<=\b[a-z]+)\d+/)
invalid pattern in look-behind: /(?<=\b[a-z]+)\d+/ (SyntaxError)

There are various workarounds possible depending upon the use case. Some of the positive lookbehind cases can be solved using \K as seen in the previous section (but note that \K isn't a zero-width assertion). For some cases, you can skip lookbehind entirely and just use normal groupings. This works even when you don't know the length of patterns.

>> s = 'pore42 tar3 dare7 care5'

# examples where lookbehind won't give error
# same as: s.scan(/(?<=tar|dare)\d+/)
>> s.gsub(/(?:tar|dare)(\d+)/).map { $1 }
=> ["3", "7"]
# delete digits only if they are preceded by 'tar' or 'dare'
# same as: s.gsub(/(?<=tar|dare)\d+/, '')
>> s.gsub(/(tar|dare)\d+/, '\1')
=> "pore42 tar dare care5"

# examples where lookbehind will give error
# workaround for /(?<=\b[pd][a-z]*)\d+/
# get digits only if they are preceded by a word starting with 'p' or 'd'
>> s.gsub(/\b[pd][a-z]*(\d+)/).map { $1 }
=> ["42", "7"]
# delete digits only if they are preceded by a word starting with 'p' or 'd'
>> s.gsub(/(\b[pd][a-z]*)\d+/, '\1')
=> "pore tar3 dare care5"
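
The extraction workaround can also be written with scan, which returns the captured portions when the regexp has capture groups. A minimal sketch, with a made-up helper name:

```ruby
# hypothetical helper: digits preceded by a word starting with 'p' or 'd'
def digits_after_pd_words(str)
  # with one capture group, scan gives an array of single element arrays
  str.scan(/\b[pd][a-z]*(\d+)/).flatten
end

p digits_after_pd_words('pore42 tar3 dare7 care5')
# => ["42", "7"]
```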

However, if you don't know the lengths for negative lookbehind, you cannot use the above workarounds. The next section will show how to negate a grouping, and that helps for some of the variable length negative lookbehind cases.

Negated groups and the absence operator

Some of the variable length negative lookbehind cases can be simulated by using a negative lookahead (which doesn't have restriction on variable length). The trick is to assert negative lookahead one character at a time and apply quantifiers on such a grouping to satisfy the variable requirement. This will only work if you have well defined conditions before the negated group.

# match 'dog' only if it is not preceded by 'cat'
# note the use of \A anchor to force matching all characters up to 'dog'
# cannot use /(?<!cat.*)dog/ as variable length lookbehind is not allowed
>> 'fox,cat,dog,parrot'.match?(/\A((?!cat).)*dog/)
=> false
# match 'dog' only if it is not preceded by 'parrot'
>> 'fox,cat,dog,parrot'.match?(/\A((?!parrot).)*dog/)
=> true

# use non-capturing group if required
>> words = 'apple banana 12_bananas cherry fig mango cake42'
>> words.scan(/\b[a-z](?:(?!pp|rr)[a-z])*\b/)
=> ["banana", "fig", "mango"]

Check the matched portions for easier understanding of negated groups:

>> 'fox,cat,dog,parrot'[/\A((?!cat).)*/]
=> "fox,"
>> 'fox,cat,dog,parrot'[/\A((?!parrot).)*/]
=> "fox,cat,dog,"
>> 'fox,cat,dog,parrot'[/\A(?:(?!(.)\1).)*/]
=> "fox,cat,dog,pa"

There's an alternate syntax that can be used for cases where the grouping to be negated is bound on both sides by another regexp, anchor, etc. It is known as the absence operator and the syntax is (?~pat).

# match if 'do' is not there between 'at' and 'par'
# note that quantifier is not used, absence operator takes care of it
# same as: /at((?!do).)*par/
>> 'fox,cat,dog,parrot'.match?(/at(?~do)par/)
=> false

# match if 'go' is not there between 'at' and 'par'
>> 'fox,cat,dog,parrot'.match?(/at(?~go)par/)
=> true
>> 'fox,cat,dog,parrot'[/at(?~go)par/]
=> "at,dog,par"
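
The sketch below runs both forms of the negated grouping side by side for the sample string used above. The helper name absent_between is hypothetical, and the text to be negated is assumed to have no regexp metacharacters since it is interpolated as is.

```ruby
# hypothetical helper: check 'str' with both forms of the negated grouping
# 'absent' is assumed to be plain text without regexp metacharacters
def absent_between(str, absent)
  [str.match?(/at(?~#{absent})par/), str.match?(/at((?!#{absent}).)*par/)]
end

p absent_between('fox,cat,dog,parrot', 'do')
# => [false, false]
p absent_between('fox,cat,dog,parrot', 'go')
# => [true, true]
```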

\G anchor

The \G anchor matches the start of the input string, just like the \A anchor. In addition, it will also match at the end of the previous match. This helps you to mark a particular location in the input string and continue from there instead of having the pattern always check for the specific location. This is best understood with examples.

First, a simple example of using \G without alternations. The goal is to replace every character of the first field with * where whitespace is the field separator.

>> record = '123-456-7890 Joe (30-40) years'

# simply using \S will replace all the non-whitespace characters
>> record.gsub(/\S/, '*')
=> "************ *** ******* *****"
# naively adding the \A anchor replaces only the first one
>> record.gsub(/\A\S/, '*')
=> "*23-456-7890 Joe (30-40) years"

# \G to the rescue!
>> record.gsub(/\G\S/, '*')
=> "************ Joe (30-40) years"
>> record.scan(/\G\S/)
=> ["1", "2", "3", "-", "4", "5", "6", "-", "7", "8", "9", "0"]

In the above example, \G will first match the start of the string. So, the first character is replaced with * since \S matches the non-whitespace character 1. The ending of 1 will now be considered as the new anchor for \G. The second character will then match because 2 is a non-whitespace character and \G assertion is satisfied due to the previous match. This will continue until the end of the field, which is 0 in the above example. When the next character is considered, \G assertion is still satisfied but \S fails due to the space character. Because the matching failed, \G will not be satisfied when the next character J is considered. So, no more characters can match since this particular example doesn't provide an alternate way for \G to be reactivated.
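
The walkthrough above can be observed by collecting the starting offset of each match. This is only a sketch: the helper name match_offsets is made up, and it relies on the block form of gsub setting $~ for every match.

```ruby
# hypothetical helper: starting offset of every match found by gsub
def match_offsets(str, re)
  offsets = []
  # the return value of gsub is ignored, only the offsets are collected
  str.gsub(re) { offsets << $~.begin(0); $~[0] }
  offsets
end

record = '123-456-7890 Joe (30-40) years'

# every match starts where the previous one ended
p match_offsets(record, /\G\S/)
# => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# without \G, matching resumes after the space characters as well
p match_offsets(record, /\S/).size
# => 27
```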

Here are some more examples of using \G without alternations:

# all digits and optional hyphen combo from start of string
>> record = '123-456-7890 Joe (30-40) years'
>> record.scan(/\G\d+-?/)
=> ["123-", "456-", "7890"]
>> record.gsub(/\G(\d+)(-?)/, '[\1]\2')
=> "[123]-[456]-[7890] Joe (30-40) years"

# all word characters from the start of string
# only if it is followed by a word character
>> 'cat_12 bat_100 kite_42'.scan(/\G\w(?=\w)/)
=> ["c", "a", "t", "_", "1"]
>> 'cat_12 bat_100 kite_42'.gsub(/\G\w(?=\w)/, '\0:')
=> "c:a:t:_:1:2 bat_100 kite_42"

# all lowercase alphabets or space from the start of string
>> 'par tar-den hen-food mood'.gsub(/\G[a-z ]/, '(\0)')
=> "(p)(a)(r)( )(t)(a)(r)-den hen-food mood"

Next, \G can be used as part of alternations so that it can be activated anywhere in the input string. Suppose you need to extract one or more numbers that follow a particular name. Here's one way to solve it:

>> marks = 'Joe 75 88 Mina 89 85 84 John 90'

>> marks.scan(/(?:Mina|\G) \K\d+/)
=> ["89", "85", "84"]

>> marks.scan(/(?:Joe|\G) \K\d+/)
=> ["75", "88"]

>> marks.scan(/(?:John|\G) \K\d+/)
=> ["90"]

In the above example, \G matches the start of the string but the input string doesn't start with a space character. So the regular expression can be satisfied only after the other alternative is matched. Consider the first pattern where Mina is the other alternative. Once that string is found, a space and digit characters will satisfy the rest of the regexp. Ending of the match, i.e. Mina 89 in this case, will now be the \G anchoring position. This will allow 85 and 84 to be matched subsequently. After that, J fails the \d pattern and no more matches are possible (as Mina isn't found another time).
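
The same idea can be wrapped in a method so that the name is passed as an argument. This is a sketch under the assumption that names and numbers are separated by a single space, as in the sample input. The method name numbers_after is hypothetical.

```ruby
# hypothetical helper: numbers that immediately follow the given name
def numbers_after(marks, name)
  marks.scan(/(?:#{Regexp.escape(name)}|\G) \K\d+/)
end

marks = 'Joe 75 88 Mina 89 85 84 John 90'

p numbers_after(marks, 'Mina')   # => ["89", "85", "84"]
p numbers_after(marks, 'Joe')    # => ["75", "88"]

# \G alone cannot start a match here since the string doesn't begin with a space
p numbers_after(marks, 'Ravi')   # => []
```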

In some cases, \G anchoring at the start of the string will cause issues. One workaround is to add a negative lookaround assertion. Here's an example. The goal is to mask the password only for the given name.

>> passwords = 'Rohit:hunter2 Ram:123456 Ranjit:abcdef'

# the first space separated field is also getting masked here
>> passwords.gsub(/(?:Ram:\K|\G)\S/, '*')
=> "************* Ram:****** Ranjit:abcdef"

# adding a negative assertion helps
>> passwords.gsub(/(?:Ram:\K|\G(?!\A))\S/, '*')
=> "Rohit:hunter2 Ram:****** Ranjit:abcdef"
>> passwords.gsub(/(?:Rohit:\K|\G(?!\A))\S/, '*')
=> "Rohit:******* Ram:123456 Ranjit:abcdef"
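
Here's a sketch that generalizes the above pattern for any name. The method name mask_password is made up, and the given name is assumed to not be the ending portion of another name in the input (for example, 'it' would also match the end of 'Rohit').

```ruby
# hypothetical helper: mask the password for the given name only
# assumes 'name' isn't the ending portion of some other name
def mask_password(passwords, name)
  passwords.gsub(/(?:#{Regexp.escape(name)}:\K|\G(?!\A))\S/, '*')
end

passwords = 'Rohit:hunter2 Ram:123456 Ranjit:abcdef'

p mask_password(passwords, 'Ram')
# => "Rohit:hunter2 Ram:****** Ranjit:abcdef"
p mask_password(passwords, 'Ranjit')
# => "Rohit:hunter2 Ram:123456 Ranjit:******"
```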

Cheatsheet and Summary

Note                 Description
lookarounds          custom assertions, zero-width like anchors
(?!pat)              negative lookahead assertion
(?<!pat)             negative lookbehind assertion
(?=pat)              positive lookahead assertion
(?<=pat)             positive lookbehind assertion
(?!pat1)(?=pat2)     multiple assertions can be specified next to each other in any order
                     as they mark a matching location without consuming characters
pat\K                pat won't be part of the matching portion
                     \K can be used for some of the positive lookbehind cases
((?!pat).)*          Negate a grouping, similar to negated character class
                     helpful to emulate some variable length negative lookbehind cases
(?~pat)              absence operator
                     similar to ((?!pat).)* if bounded on both sides
\G                   restricts the matching from the start of string like \A
                     continues matching from the end of previous match as the new anchor
                     ex: '12-34 42'.scan(/\G\d+-?/) gives ["12-", "34"]

In this chapter, you learnt how to use lookarounds to create custom restrictions and also how to use negated grouping. You also learnt about the \K feature and the \G anchor. With this, most of the powerful features of regexp have been covered. The special groupings seem never ending though; there are some more of them in the coming chapters!

Exercises

Info: Please use lookarounds for solving the following exercises even if you can do it without lookarounds, unless lookarounds cannot be used, as is the case for variable length lookbehinds.

1) Replace all whole words with X unless they are preceded by a ( character.

>> ip = '(apple) guava berry) apple (mango) (grape'

##### add your solution here
=> "(apple) X X) X (mango) (grape"

2) Replace all whole words with X unless they are followed by a ) character.

>> ip = '(apple) guava berry) apple (mango) (grape'

##### add your solution here
=> "(apple) X berry) X (mango) (X"

3) Replace all whole words with X unless they are preceded by ( or followed by ) characters.

>> ip = '(apple) guava berry) apple (mango) (grape'

##### add your solution here
=> "(apple) X berry) X (mango) (grape"

4) Extract all whole words that do not end with e or n.

>> ip = 'a_t row on Urn e note Dust n end a2-e|u'

##### add your solution here
=> ["a_t", "row", "Dust", "end", "a2", "u"]

5) Extract all whole words that do not start with a or d or n.

>> ip = 'a_t row on Urn e note Dust n end a2-e|u'

##### add your solution here
=> ["row", "on", "Urn", "e", "Dust", "end", "e", "u"]

6) Extract all whole words only if they are followed by : or , or -.

>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

##### add your solution here
=> ["Poke", "so_good", "ever2"]

7) Extract all whole words only if they are preceded by = or / or -.

>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

##### add your solution here
=> ["so_good", "is", "sit"]

8) Extract all whole words only if they are preceded by = or : and followed by : or ..

>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

##### add your solution here
=> ["so_good", "ink"]

9) Extract all whole words only if they are preceded by = or : or . or ( or - and not followed by . or /.

>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

##### add your solution here
=> ["so_good", "vast", "sit"]

10) Remove the leading and trailing whitespaces from all the individual fields where , is the field separator.

>> csv1 = " comma  ,separated ,values \t\r "
>> csv2 = 'good bad,nice  ice  , 42 , ,   stall   small'

>> remove_whitespace =      ##### add your solution here

>> csv1.gsub(remove_whitespace, '')
=> "comma,separated,values"
>> csv2.gsub(remove_whitespace, '')
=> "good bad,nice  ice,42,,stall   small"

11) Filter elements that satisfy all of these rules:

  • should have at least two alphabets
  • should have at least three digits
  • should have at least one special character among % or * or # or $
  • should not end with a whitespace character

>> pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']

>> rule_chk =       ##### add your solution here

>> pwds.grep(rule_chk)
=> ["F2H3u%9", "A $B C1234"]

12) For the given string, surround all whole words with {} except for whole words par and cat and apple.

>> ip = 'part; cat {super} rest_42 par scatter apple spar'

##### add your solution here
=> "{part}; cat {{super}} {rest_42} par {scatter} apple {spar}"

13) Extract the integer portion of floating-point numbers for the given string. Integers and numbers ending with . and no further digits should not be considered.

>> ip = '12 ab32.4 go 5 2. 46.42 5'

##### add your solution here
=> ["32", "46"]

14) For the given input strings, extract all overlapping two character sequences.

>> s1 = 'apple'
>> s2 = '1.2-3:4'

>> pat =        ##### add your solution here

##### add your solution here for s1
=> ["ap", "pp", "pl", "le"]
##### add your solution here for s2
=> ["1.", ".2", "2-", "-3", "3:", ":4"]

15) The given input strings contain fields separated by the : character. Delete : and the last field if there is a digit character anywhere before the last field.

>> s1 = '42:cat'
>> s2 = 'twelve:a2b'
>> s3 = 'we:be:he:0:a:b:bother'
>> s4 = 'apple:banana-42:cherry:'
>> s5 = 'dragon:unicorn:centaur'

>> pat =        ##### add your solution here

##### add your solution here for s1
=> "42"
##### add your solution here for s2
=> "twelve:a2b"
##### add your solution here for s3
=> "we:be:he:0:a:b"
##### add your solution here for s4
=> "apple:banana-42:cherry"
##### add your solution here for s5
=> "dragon:unicorn:centaur"

16) Extract all whole words unless they are preceded by : or <=> or ---- or #.

>> ip = '::very--at<=>row|in.a_b#b2c=>lion----east'

##### add your solution here
=> ["at", "in", "a_b", "lion"]

17) Match strings if they contain qty followed by price but not if there is any whitespace character or the string error between them.

>> str1 = '23,qty,price,42'
>> str2 = 'qty price,oh'
>> str3 = '3.14,qty,6,errors,9,price,3'
>> str4 = "42\nqty-6,apple-56,price-234,error"
>> str5 = '4,price,3.14,qty,4'
>> str6 = '(qtyprice) (hi-there)'

>> neg =        ##### add your solution here

>> str1.match?(neg)
=> true
>> str2.match?(neg)
=> false
>> str3.match?(neg)
=> false
>> str4.match?(neg)
=> true
>> str5.match?(neg)
=> false
>> str6.match?(neg)
=> true

18) Can you reason out why the following regular expressions behave differently?

>> ip = 'I have 12, he has 2!'

>> ip.gsub(/\b..\b/, '{\0}')
=> "{I }have {12}{, }{he} has{ 2}!"

>> ip.gsub(/(?<!\w)..(?!\w)/, '{\0}')
=> "I have {12}, {he} has {2!}"

19) The given input strings have fields separated by the : character. Assume that each string has a minimum of two fields and cannot have empty fields. Extract all fields, but stop if a field with a digit character is found.

>> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
>> row2 = 'um:no:low:3e:s4w:seer'
>> row3 = 'oh100:apple:banana:fig'
>> row4 = 'Dragon:Unicorn:Wizard-Healer'

>> pat =        ##### add your solution here

>> row1.gsub(pat).map { $1 }
=> ["vast"]
>> row2.gsub(pat).map { $1 }
=> ["um", "no", "low"]
>> row3.gsub(pat).map { $1 }
=> []
>> row4.gsub(pat).map { $1 }
=> ["Dragon", "Unicorn", "Wizard-Healer"]

20) The given input strings have fields separated by the : character. Extract all fields only after a field containing a digit character is found. Assume that each string has a minimum of two fields and cannot have empty fields.

>> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
>> row2 = 'um:no:low:3e:s4w:seer'
>> row3 = 'oh100:apple:banana:fig'
>> row4 = 'Dragon:Unicorn:Wizard-Healer'

>> pat =        ##### add your solution here

>> row1.scan(pat)
=> ["ride", "in", "awe", "b2b", "3list", "end"]
>> row2.scan(pat)
=> ["s4w", "seer"]
>> row3.scan(pat)
=> ["apple", "banana", "fig"]
>> row4.scan(pat)
=> []

21) The given input string has comma separated fields and some of them can occur more than once. For the duplicated fields, retain only the rightmost one. Assume that there are no empty fields.

>> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,scat,6,6,4,Cat,425,4'

##### add your solution here
=> "421,2425,5,cat,42,61,scat,6,Cat,425,4"