Exercise solutions
Try to solve the exercises in every chapter using only the features discussed until that chapter. Some of the exercises will be easier to solve with techniques presented in the later chapters, but the aim of these exercises is to explore the features presented so far.
RegExp introduction
1) Check if the given input strings contain two
irrespective of case.
> let s1 = 'Their artwork is exceptional'
> let s2 = 'one plus tw0 is not three'
> let s3 = 'TRUSTWORTHY'
> const pat1 = /two/i
> pat1.test(s1)
< true
> pat1.test(s2)
< false
> pat1.test(s3)
< true
2) For the given array, filter all elements that do not contain e
.
> let items = ['goal', 'new', 'user', 'sit', 'eat', 'dinner']
> items.filter(w => !/e/.test(w))
< ['goal', 'sit']
3) Replace only the first occurrence of 5
with five
for the given string.
> let ip = 'They ate 5 apples and 5 oranges'
> ip.replace(/5/, 'five')
< 'They ate five apples and 5 oranges'
4) Replace all occurrences of 5
with five
for the given string.
> let ip = 'They ate 5 apples and 5 oranges'
> ip.replace(/5/g, 'five')
< 'They ate five apples and five oranges'
5) Replace all occurrences of note
irrespective of case with X
.
> let ip = 'This note should not be NoTeD'
> ip.replace(/note/ig, 'X')
< 'This X should not be XD'
6) For the given multiline input string, filter all lines NOT containing the string 2
.
> let purchases = `items qty
apple 24
mango 50
guava 42
onion 31
water 10`
> const num = /2/
> console.log(purchases.split('\n')
.filter(e => !num.test(e))
.join('\n'))
< items qty
mango 50
onion 31
water 10
You'd be able to solve this using just the
replace()
method by the end of the Dot metacharacter and Quantifiers chapter.
7) For the given array, filter all elements that contain either a
or w
.
> let items = ['goal', 'new', 'user', 'sit', 'eat', 'dinner']
> items.filter(w => /a/.test(w) || /w/.test(w))
< ['goal', 'new', 'eat']
8) For the given array, filter all elements that contain both e
and n
.
> let items = ['goal', 'new', 'user', 'sit', 'eat', 'dinner']
> items.filter(w => /e/.test(w) && /n/.test(w))
< ['new', 'dinner']
9) For the given string, replace 0xA0
with 0x7F
and 0xC0
with 0x1F
.
> let ip = 'start address: 0xA0, func1 address: 0xC0'
> ip.replace(/0xA0/, '0x7F').replace(/0xC0/, '0x1F')
< 'start address: 0x7F, func1 address: 0x1F'
Anchors
1) Check if the given input strings contain is
or the
as whole words.
> let str1 = 'is; (this)'
> let str2 = "The food isn't good"
> let str3 = 'the2 cats'
> let str4 = 'switch on the light'
> const pat1 = /\bis\b/
> const pat2 = /\bthe\b/
> pat1.test(str1) || pat2.test(str1)
< true
> pat1.test(str2) || pat2.test(str2)
< false
> pat1.test(str3) || pat2.test(str3)
< false
> pat1.test(str4) || pat2.test(str4)
< true
2) For the given input string, change only the whole word red
to brown
.
> let ip = 'bred red spread credible red;'
> ip.replace(/\bred\b/g, 'brown')
< 'bred brown spread credible brown;'
3) For the given array, filter all elements that contain 42
surrounded by word characters.
> let items = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', 'fake4b']
> items.filter(e => /\B42\B/.test(e))
< ['hi42bye', 'nice1423', 'cool_42a']
4) For the given input array, filter all elements that start with den
or end with ly
.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\n', 'dent']
> items.filter(e => /^den/.test(e) || /ly$/.test(e))
< ['lovely', '2 lonely', 'dent']
5) For the given input string, change whole word mall
to 1234
only if it is at the start of a line.
> let para = `(mall) call ball pall
ball fall wall tall
mall call ball pall
wall mall ball fall
mallet wallet malls
mall:call:ball:pall`
> console.log(para.replace(/^mall\b/gm, '1234'))
< (mall) call ball pall
ball fall wall tall
1234 call ball pall
wall mall ball fall
mallet wallet malls
1234:call:ball:pall
6) For the given array, filter all elements having a line starting with den
or ending with ly
.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\nfar', 'dent']
> items.filter(e => /^den/m.test(e) || /ly$/m.test(e))
< ['lovely', '1\ndentist', '2 lonely', 'fly\nfar', 'dent']
7) For the given input array, filter all whole elements 12\nthree
irrespective of case.
> let items = ['12\nthree\n', '12\nThree', '12\nthree\n4', '12\nthree']
> items.filter(e => /^12\nthree$/i.test(e))
< ['12\nThree', '12\nthree']
8) For the given input array, replace hand
with X
for all elements that start with hand
followed by at least one word character.
> let items = ['handed', 'hand', 'handy', 'un-handed', 'handle', 'hand-2']
> items.map(w => w.replace(/^hand\B/, 'X'))
< ['Xed', 'hand', 'Xy', 'un-handed', 'Xle', 'hand-2']
9) For the given input array, filter all elements starting with h
. Additionally, replace e
with X
for these filtered elements.
> let items = ['handed', 'hand', 'handy', 'unhanded', 'handle', 'hand-2']
> items.filter(w => /^h/.test(w)).map(w => w.replace(/e/g, 'X'))
< ['handXd', 'hand', 'handy', 'handlX', 'hand-2']
10) Why does the following code show false
instead of true
?
Because $
matches only the end of string. You'll have to use the m
flag to enable matching at the end of line separators. Some regular expression engines do allow $
to match just before \n
if it is the last character in the string, but not JavaScript.
> /end$/.test('bend it\nand send\n')
< false
> /end$/m.test('bend it\nand send\n')
< true
Alternation and Grouping
1) For the given input array, filter all elements that start with den
or end with ly
.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\n', 'dent']
> items.filter(e => /^den|ly$/.test(e))
< ['lovely', '2 lonely', 'dent']
2) For the given array, filter all elements having a line starting with den
or ending with ly
.
> let items = ['lovely', '1\ndentist', '2 lonely', 'eden', 'fly\nfar', 'dent']
> items.filter(e => /^den|ly$/m.test(e))
< ['lovely', '1\ndentist', '2 lonely', 'fly\nfar', 'dent']
3) For the given input strings, replace all occurrences of removed
or reed
or received
or refused
with X
.
> let s1 = 'creed refuse removed read'
> let s2 = 'refused reed redo received'
> const pat1 = /re(mov|ceiv|fus|)ed/g
> s1.replace(pat1, 'X')
< 'cX refuse X read'
> s2.replace(pat1, 'X')
< 'X X redo X'
4) For the given input strings, replace late
or later
or slated
with A
.
> let str1 = 'plate full of slate'
> let str2 = "slated for later, don't be late"
> const pat2 = /slated|late(r|)/g
> str1.replace(pat2, 'A')
< 'pA full of sA'
> str2.replace(pat2, 'A')
< "A for A, don't be A"
Escaping metacharacters
1) Transform the given input strings to the expected output using the same logic on both strings.
> let str1 = '(9-2)*5+qty/3-(9-2)*7'
> let str2 = '(qty+4)/2-(9-2)*5+pq/4'
> const pat1 = /\(9-2\)\*5/g
> str1.replace(pat1, '35')
< '35+qty/3-(9-2)*7'
> str2.replace(pat1, '35')
< '(qty+4)/2-35+pq/4'
2) Replace (4)\|
with 2
only at the start or end of the given input strings.
> let s1 = '2.3/(4)\\|6 fig 5.3-(4)\\|'
> let s2 = '(4)\\|42 - (4)\\|3'
> let s3 = 'two - (4)\\|\n'
> const pat2 = /^\(4\)\\\||\(4\)\\\|$/g
> s1.replace(pat2, '2')
< '2.3/(4)\\|6 fig 5.3-2'
> s2.replace(pat2, '2')
< '242 - (4)\\|3'
> s3.replace(pat2, '2')
< 'two - (4)\\|\n'
3) Replace any matching element from the array items
with X
for given the input strings. Match the elements from items
literally. Assume no two elements of items
will result in any matching conflict.
> let items = ['a.b', '3+n', 'x\\y\\z', 'qty||price', '{n}']
> function escapeRegExp(string) {
return string.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&')
}
> function unionRegExp(arr) {
return arr.map(w => escapeRegExp(w)).join('|')
}
> const pat3 = new RegExp(unionRegExp(items), 'g')
> '0a.bcd'.replace(pat3, 'X')
< '0Xcd'
> 'E{n}AMPLE'.replace(pat3, 'X')
< 'EXAMPLE'
> '43+n2 ax\\y\\ze'.replace(pat3, 'X')
< '4X2 aXe'
4) Replace the backspace character \b
with a single space character for the given input string.
> let ip = '123\b456'
> ip.replace(/\x08/, ' ')
< '123 456'
5) Replace all occurrences of \e
with e
.
> let ip = 'th\\er\\e ar\\e common asp\\ects among th\\e alt\\ernations'
> ip.replace(/\\e/g, 'e')
< 'there are common aspects among the alternations'
6) Replace any matching item from the array eqns
with X
for given the string ip
. Match the items from eqns
literally.
> let ip = '3-(a^b)+2*(a^b)-(a/b)+3'
> let eqns = ['(a^b)', '(a/b)', '(a^b)+2']
// note that '/' is also escaped here
> function escapeRegExp(string) {
return string.replace(/[.*+\-?^${}()|[\]\\\/]/g, '\\$&')
}
> function unionRegExp(arr) {
return arr.map(w => escapeRegExp(w)).join('|')
}
> eqns.sort((a, b) => b.length - a.length)
< ['(a^b)+2', '(a^b)', '(a/b)']
> const pat4 = new RegExp(unionRegExp(eqns), 'g')
> pat4
< /\(a\^b\)\+2|\(a\^b\)|\(a\/b\)/g
> ip.replace(pat4, 'X')
< '3-X*X-X+3'
Dot metacharacter and Quantifiers
Use
s
flag for these exercises depending upon the contents of the input strings.
1) Replace 42//5
or 42/5
with 8
for the given input.
> let ip = 'a+42//5-c pressure*3+42/5-14256'
> ip.replace(/42\/\/?5/g, '8')
< 'a+8-c pressure*3+8-14256'
2) For the array items
, filter all elements starting with hand
and ending immediately with at most one more character or le
.
> let items = ['handed', 'hand', 'handled', 'handy', 'unhand', 'hands', 'handle']
> items.filter(w => /^hand(.|le)?$/.test(w))
< ['hand', 'handy', 'hands', 'handle']
3) Use the split()
method to get the output as shown for the given input strings.
> let eqn1 = 'a+42//5-c'
> let eqn2 = 'pressure*3+42/5-14256'
> let eqn3 = 'r*42-5/3+42///5-42/53+a'
> const pat1 = new RegExp(`42//?5`)
> eqn1.split(pat1)
< ['a+', '-c']
> eqn2.split(pat1)
< ['pressure*3+', '-14256']
> eqn3.split(pat1)
< ['r*42-5/3+42///5-', '3+a']
4) For the given input strings, remove everything from the first occurrence of i
till the end of the string.
> let s1 = 'remove the special meaning of such constructs'
> let s2 = 'characters while constructing'
> let s3 = 'input output'
> const pat2 = /i.*/
> s1.replace(pat2, '')
< 'remove the spec'
> s2.replace(pat2, '')
< 'characters wh'
> s3.replace(pat2, '')
< ''
5) For the given strings, construct a regexp to get the output as shown.
> let str1 = 'a+b(addition)'
> let str2 = 'a/b(division) + c%d(#modulo)'
> let str3 = 'Hi there(greeting). Nice day(a(b)'
> const remove_parentheses = /\(.*?\)/g
> str1.replace(remove_parentheses, '')
< 'a+b'
> str2.replace(remove_parentheses, '')
< 'a/b + c%d'
> str3.replace(remove_parentheses, '')
< 'Hi there. Nice day'
6) Correct the given regexp to get the expected output.
> let words = 'plink incoming tint winter in caution sentient'
// wrong output
> const w1 = /int|in|ion|ing|inco|inter|ink/g
> words.replace(w1, 'X')
"plXk XcomXg tX wXer X cautX sentient"
// expected output
> const w2 = /in(ter|co|g|k|t)?|ion/g
> words.replace(w2, 'X')
"plX XmX tX wX X cautX sentient"
7) For the given greedy quantifiers, what would be the equivalent form using the {m,n}
representation?
?
is same as{0,1}
*
is same as{0,}
+
is same as{1,}
8) (a*|b*)
is same as (a|b)*
— true or false?
False. Because (a*|b*)
will match only sequences like a
, aaa
, bb
, bbbbbbbb
. But (a|b)*
can match mixed sequences like ababbba
too.
9) For the given input strings, remove everything from the first occurrence of test
(irrespective of case) till the end of the string, provided test
isn't at the end of the string.
> let s1 = 'this is a Test'
> let s2 = 'always test your regexp for corner\ncases'
> let s3 = 'a TEST of skill tests?'
> let pat3 = /test.+/is
> s1.replace(pat3, '')
< 'this is a Test'
> s2.replace(pat3, '')
< 'always '
> s3.replace(pat3, '')
< 'a '
10) For the input array words
, filter all elements starting with s
and containing e
and t
in any order.
> let words = ['sequoia', 'subtle', 'exhibit', 'a set', 'sets', 'tests', 'site']
> words.filter(w => /^s.*(e.*t|t.*e)/.test(w))
< ['subtle', 'sets', 'site']
11) For the input array words
, remove all elements having less than 6
characters.
> let words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 'tests', 'site']
> words.filter(w => /.{6,}/.test(w))
< ['sequoia', 'subtle', 'exhibit']
12) For the input array words
, filter all elements starting with s
or t
and having a maximum of 6
characters.
> let words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 't set', 'site']
> words.filter(w => /^(s|t).{0,5}$/.test(w))
< ['subtle', 'sets', 't set', 'site']
13) Delete from the string start
if it is at the beginning of a line up to the next occurrence of the string end
at the end of a line. Match these keywords irrespective of case.
> let para = `good start
start working on that
project you always wanted
to, do not let it end
hi there
start and end the end
42
Start and try to
finish the End
bye`
> const mpat = /^start.*?end$/igms
> console.log(para.replace(mpat, ''))
< good start
hi there
42
bye
14) Can you reason out why this code results in the output shown? The aim was to remove all <characters>
patterns but not the <>
ones. The expected result was 'a 1<> b 2<> c'
.
The use of .+
quantifier after <
means that <>
cannot be a possible match to satisfy <.+?>
. So, after matching <
(which occurs after 1
and 2
in the given input string) the regular expression engine will look for next occurrence of >
character to satisfy the given pattern. To solve such cases, you need to use character classes (discussed in a later chapter) to specify which particular set of characters should be matched by the +
quantifier (instead of the .
metacharacter).
> let ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
> ip.replace(/<.+?>/g, '')
< 'a 1 2'
15) Use the split()
method to get the output as shown below for the given input strings.
> let s1 = 'go there :: this :: that'
> let s2 = 'a::b :: c::d e::f :: 4::5'
> let s3 = '42:: hi::bye::see :: carefully'
> const pat4 = / +:: +(.+)/
> s1.split(pat4, 2)
< ['go there', 'this :: that']
> s2.split(pat4, 2)
< ['a::b', 'c::d e::f :: 4::5']
> s3.split(pat4, 2)
< ['42:: hi::bye::see', 'carefully']
Working with matched portions
1) For the given strings, extract the matching portion from the first is
to the last t
.
> let str1 = 'What is the biggest fruit you have seen?'
> let str2 = 'Your mission is to read and practice consistently'
> const pat1 = /is.*t/
> str1.match(pat1)[0]
< 'is the biggest fruit'
> str2.match(pat1)[0]
< 'ission is to read and practice consistent'
2) Find the starting index of the first occurrence of is
or the
or was
or to
for the given input strings. Assume that there will be at least one match for each input string.
> let s1 = 'match after the last newline character'
> let s2 = 'and then you want to test'
> let s3 = 'this is good bye then'
> let s4 = 'who was there to see?'
> const pat2 = /is|the|was|to/
> s1.search(pat2)
< 12
> s2.search(pat2)
< 4
> s3.search(pat2)
< 2
> s4.search(pat2)
< 4
3) Find the starting index of the last occurrence of is
or the
or was
or to
for the given input strings. Assume that there will be at least one match for each input string.
> let s1 = 'match after the last newline character'
> let s2 = 'and then you want to test'
> let s3 = 'this is good bye then'
> let s4 = 'who was there to see?'
> const pat3 = /.*(is|the|was|to)/d
> s1.match(pat3).indices[1][0]
< 12
> s2.match(pat3).indices[1][0]
< 18
> s3.match(pat3).indices[1][0]
< 17
> s4.match(pat3).indices[1][0]
< 14
4) The given input string contains :
exactly once. Extract all characters after the :
as output.
> let ip = 'fruits:apple, mango, guava, blueberry'
> ip.match(/:(.*)/)[1]
< 'apple, mango, guava, blueberry'
5) Extract all words between (
and )
from the given input string as an array (including the parentheses). Assume that the input will not contain any broken parentheses.
> let ip = 'another (way) to reuse (portion) matched (by) capture groups'
> ip.match(/\(.*?\)/g)
< ['(way)', '(portion)', '(by)']
6) Extract all occurrences of <
up to the next occurrence of >
, provided there is at least one character in between <
and >
.
> let ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
> ip.match(/<.+?>/g)
< ['<apple>', '<> b<bye>', '<> c<cat>']
7) Use matchAll()
to get the output as shown below for the given input strings. Note the characters used in the input strings carefully.
> let row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
> let row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
> const pat4 = /(.+?),(.+?) /g
> Array.from(row1.matchAll(pat4), m => [m[1], m[2]])
< (4) [Array(2), Array(2), Array(2), Array(2)]
0: (2) ['-2', '5']
1: (2) ['4', '+3']
2: (2) ['+42', '-53']
3: (2) ['4356246', '-357532354']
length: 4
[[Prototype]]: Array(0)
> Array.from(row2.matchAll(pat4), m => [m[1], m[2]])
< (3) [Array(2), Array(2), Array(2)]
0: (2) ['1.32', '-3.14']
1: (2) ['634', '5.63']
2: (2) ['63.3e3', '9907809345343.235']
length: 3
[[Prototype]]: Array(0)
8) This is an extension to the previous question. Sum each pair of numbers that are separated by a comma.
- For
row1
, find the sum of integers. For example, sum of-2
and5
is3
. - For
row2
, find the sum of floating-point numbers. For example, sum of1.32
and-3.14
is-1.82
.
> let row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
> let row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
// should be same as the previous question
> const pat5 = /(.+?),(.+?) /g
> Array.from(row1.matchAll(pat5), m => +m[1] + +m[2])
< [3, 7, -11, -353176108]
> Array.from(row2.matchAll(pat5), m => +m[1] + +m[2])
< [-1.82, 639.63, 9907809408643.234]
9) Use the split()
method to get the output as shown below.
> let ip = '42:no-output;1000:car-tr:u-ck;SQEX49801'
> ip.split(/:.+?-(.+?);/)
< ['42', 'output', '1000', 'tr:u-ck', 'SQEX49801']
10) Write a string function that changes the given input to alternate case. The first alphabet should be changed to lowercase, the next one to uppercase and then lowercase and so on. Characters other than alphabets should be left alone and not affect case changing.
> function aLtErNaTeCaSe(ip) {
let b = true
return ip.replace(/[a-z]/ig, m => (b = !b) ? m.toUpperCase() : m.toLowerCase())
}
> aLtErNaTeCaSe('HI THERE!')
< 'hI tHeRe!'
> aLtErNaTeCaSe('good morning')
< 'gOoD mOrNiNg'
> aLtErNaTeCaSe('Sample123string42with777numbers')
< 'sAmPlE123sTrInG42wItH777nUmBeRs'
11) Replace all occurrences of par
with spar
, spare
with extra
and park
with garden
.
> let s1 = 'apartment has a park'
> let s2 = 'do you have a spare cable'
> let s3 = 'write a parser'
> let d1 = {'par': 'spar', 'spare': 'extra', 'park': 'garden'}
> const pat6 = /spare|park?/g
> s1.replace(pat6, k => d1[k])
< 'aspartment has a garden'
> s2.replace(pat6, k => d1[k])
< 'do you have a extra cable'
> s3.replace(pat6, k => d1[k])
< 'write a sparser'
12) Name the flag and property you can use with the match()
method to get both the starting and ending locations of the matched portions.
The d
flag and indices
property can be used to get both the starting and ending locations of the matched portions. Here's an example:
> 'coffee:100g tea:250g'.match(/:(.*?)g/d)
< [':100g', '100', index: 6, input: 'coffee:100g tea:250g',
groups: undefined, indices: Array(2)]
// locations for the entire match
> 'coffee:100g tea:250g'.match(/:(.*?)g/d).indices[0]
< [6, 11]
// locations for the first capture group
> 'coffee:100g tea:250g'.match(/:(.*?)g/d).indices[1]
< [7, 10]
Character class
1) For the array items
, filter all elements starting with hand
and ending with s
or y
or le
. No other character in between, for example, hands
should match but not hand-has
.
> let items = ['-handy', 'hand', 'handy', 'unhand', 'hands', 'hand-icy', 'handle']
> items.filter(w => /^hand([sy]|le)$/.test(w))
< ['handy', 'hands', 'handle']
2) Replace all whole words reed
or read
or red
with X
.
> let ip = 'redo red credible :read: rod reed bred'
> ip.replace(/\bre[ae]?d\b/g, 'X')
< 'redo X credible :X: rod X bred'
3) For the array words
, filter all elements containing e
or i
followed by l
or n
. Note that the order mentioned should be followed.
> let words = ['surrender', 'unicorn', 'newer', 'door', 'empty', 'eel', 'pest']
> words.filter(w => /[ei].*[ln]/.test(w))
< ['surrender', 'unicorn', 'eel']
4) For the array words
, filter all elements containing e
or i
and l
or n
in any order.
> let words = ['surrender', 'unicorn', 'newer', 'door', 'empty', 'eel', 'pest']
> words.filter(w => /[ei].*[ln]|[ln].*[ei]/.test(w))
< ['surrender', 'unicorn', 'newer', 'eel']
5) Extract all hex character sequences, with 0x
optional prefix. Match the characters case insensitively, and the sequences shouldn't be surrounded by other word characters.
> let str1 = '128A foo 0xfe32 34 0xbar'
> let str2 = '0XDEADBEEF place 0x0ff1ce bad'
> const hex_seq = /\b(0x)?[\da-f]+\b/ig
> str1.match(hex_seq)
< ['128A', '0xfe32', '34']
> str2.match(hex_seq)
< ['0XDEADBEEF', '0x0ff1ce', 'bad']
6) Delete from (
to the next occurrence of )
unless they contain parentheses characters in between.
> let str1 = 'def factorial()'
> let str2 = 'a/b(division) + c%d(#modulo) - (e+(j/k-3)*4)'
> let str3 = 'Hi there(greeting). Nice day(a(b)'
> const remove_parentheses = /\([^()]*\)/g
> str1.replace(remove_parentheses, '')
< 'def factorial'
> str2.replace(remove_parentheses, '')
< 'a/b + c%d - (e+*4)'
> str3.replace(remove_parentheses, '')
< 'Hi there. Nice day(a'
7) For the array words
, filter all elements not starting with e
or p
or u
.
> let words = ['surrender', 'unicorn', 'newer', 'door', 'empty', 'eel', '(pest)']
> words.filter(w => /^[^epu]/.test(w))
< ['surrender', 'newer', 'door', '(pest)']
8) For the array words
, filter all elements not containing u
or w
or ee
or -
.
> let words = ['p-t', 'you', 'tea', 'heel', 'owe', 'new', 'reed', 'ear']
> words.filter(w => !/[uw-]|ee/.test(w))
< ['tea', 'ear']
9) The given input strings contain fields separated by ,
and fields can be empty too. Replace the last three fields with WHTSZ323
.
> let row1 = '(2),kite,12,,D,C,,'
> let row2 = 'hi,bye,sun,moon'
> const pat1 = /(,[^,]*){3}$/g
> row1.replace(pat1, ',WHTSZ323')
< '(2),kite,12,,D,WHTSZ323'
> row2.replace(pat1, ',WHTSZ323')
< 'hi,WHTSZ323'
10) Split the given strings based on consecutive sequence of digit or whitespace characters.
> let s1 = 'lion \t Ink32onion Nice'
> let s2 = '**1\f2\n3star\t7 77\r**'
> const pat2 = /[\d\s]+/
> s1.split(pat2)
< ['lion', 'Ink', 'onion', 'Nice']
> s2.split(pat2)
< ['**', 'star', '**']
11) Delete all occurrences of the sequence <characters>
where characters
is one or more non >
characters and cannot be empty.
> let ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
> ip.replace(/<[^>]+>/g, '')
< 'a 1<> b 2<> c'
12) \b[a-z](on|no)[a-z]\b
is same as \b[a-z][on]{2}[a-z]\b
. True or False? Sample input lines shown below might help to understand the differences, if any.
False. [on]{2}
will also match oo
and nn
.
> console.log('known\nmood\nknow\npony\ninns')
known
mood
know
pony
inns
13) For the given array, filter elements containing any number sequence greater than 624
.
> let items = ['hi0000432abcd', 'car00625', '42_624 0512', '3.14 96 2 foo1234baz']
> items.filter(e => e.match(/\d+/g).some(m => m > 624))
< ['car00625', '3.14 96 2 foo1234baz']
14) Convert the given input string to two different arrays as shown below.
> let ip = 'price_42 roast^\t\n^-ice==cat\neast'
> ip.split(/\W+/)
< ['price_42', 'roast', 'ice', 'cat', 'east']
> ip.split(/(\W+)/)
< ['price_42', ' ', 'roast', '^\t\n^-', 'ice', '==', 'cat', '\n', 'east']
15) Filter all elements whose first non-whitespace character is not a #
character. Any element made up of only whitespace characters should be ignored as well.
> let items = [' #comment', '\t\napple #42', '#oops', 'sure', 'no#1', '\t\r\f']
> items.filter(e => /^\s*[^#\s]/.test(e))
< ['\t\napple #42', 'sure', 'no#1']
16) For the given string, surround all whole words with {}
except par
and cat
.
> let ip = 'part; cat {super} rest_42 par scatter'
> ip.replace(/\w+/g, w => /\b(par|cat)\b/.test(w) ? w : `{${w}}`)
< '{part}; cat {{super}} {rest_42} par {scatter}'
Groupings and backreferences
1) Replace the space character that occurs after a word ending with a
or r
with a newline character.
> let ip = 'area not a _a2_ roar took 22'
> console.log(ip.replace(/([ar]) /g, '$1\n'))
area
not a
_a2_ roar
took 22
2) Add []
around words starting with s
and containing e
and t
in any order.
> let ip = 'sequoia subtle exhibit asset sets2 tests si_te'
> ip.replace(/\bs\w*(t\w*e|e\w*t)\w*/g, '[$&]')
< 'sequoia [subtle] exhibit asset [sets2] tests [si_te]'
3) Replace all whole words with X
that start and end with the same word character (irrespective of case). Single character word should get replaced with X
too, as it satisfies the stated condition.
> let ip = 'oreo not a _a2_ Roar took 22'
// can also use: ip.replace(/\b(\w|(\w)\w*\2)\b/ig, 'X')
> ip.replace(/\b(\w)(\w*\1)?\b/ig, 'X')
< 'X not X X X took X'
4) Convert the given markdown headers to corresponding anchor tags. Consider the input to start with one or more #
characters followed by space and word characters. The name
attribute is constructed by converting the header to lowercase and replacing spaces with hyphens. Can you do it without using a capture group?
> let header1 = '# Regular Expressions'
> let header2 = '## Named capture groups'
> function hyphenify(m) {
return `<a name='${m.toLowerCase().replace(/ /g, "-")}'></a>${m}`
}
> header1.replace(/\w.*/, hyphenify)
< "# <a name='regular-expressions'></a>Regular Expressions"
> header2.replace(/\w.*/, hyphenify)
< "## <a name='named-capture-groups'></a>Named capture groups"
5) Convert the given markdown anchors to corresponding hyperlinks.
> let anchor1 = "# <a name='regular-expressions'></a>Regular Expressions"
> let anchor2 = "## <a name='subexpression-calls'></a>Subexpression calls"
> const hyperlink = /[^']+'([^']+)'><\/a>(.+)/
> anchor1.replace(hyperlink, '[$2](#$1)')
< '[Regular Expressions](#regular-expressions)'
> anchor2.replace(hyperlink, '[$2](#$1)')
< '[Subexpression calls](#subexpression-calls)'
6) Check if the given input strings have words with at least two consecutive repeated alphabets irrespective of case. For example, words like stillnesS
and Committee
should return true
but words like root
or readable
or rotational
should return false
. Consider word to be as defined in regular expression parlance.
> let s1 = 'readable COMMItTEe'
> let s2 = 'rotational sti1lness _foot_'
> let s3 = 'needed repeated'
> let s4 = 'offsh00t'
> const pat1 = /(?:(\w)\1\w*){2}/i
> pat1.test(s1)
true
> pat1.test(s2)
false
> pat1.test(s3)
false
> pat1.test(s4)
true
7) For the given input string, replace all occurrences of digit sequences with only the unique non-repeating sequence. For example, 232323
should be changed to 23
and 897897
should be changed to 897
. If there are no repeats (for example 1234
) or if the repeats end prematurely (for example 12121
), it should not be changed.
> let ip = '1234 2323 453545354535 9339 11 60260260'
> ip.replace(/\b(\d+)\1+\b/g, '$1')
< '1234 23 4535 9339 1 60260260'
8) Replace sequences made up of words separated by :
or .
by the first word of the sequence. Such sequences will end when :
or .
is not followed by a word character.
> let ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
> ip.replace(/([:.]\w*)+/g, '')
< 'wow hi-2 bye kite'
9) Replace sequences made up of words separated by :
or .
by the last word of the sequence. Such sequences will end when :
or .
is not followed by a word character.
> let ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
> ip.replace(/((\w+)[:.])+/g, '$2')
< 'five hi-2 bye water'
10) Split the given input string on one or more repeated sequence of cat
.
> let ip = 'firecatlioncatcatcatbearcatcatparrot'
> ip.split(/(?:cat)+/)
< ['fire', 'lion', 'bear', 'parrot']
11) For the given input string, find all occurrences of digit sequences with at least one repeating sequence. For example, 232323
and 897897
. If the repeats end prematurely, for example 12121
, it should not be matched.
> let ip = '1234 2323 453545354535 9339 11 60260260'
> const pat2 = /\b(\d+)\1+\b/g
// entire sequences in the output
> ip.match(pat2)
< ['2323', '453545354535', '11']
// only the unique sequence in the output
> Array.from(ip.matchAll(pat2), m => m[1])
< ['23', '4535', '1']
12) Convert the comma separated strings to corresponding key-value pair mapping as shown below. The keys are name
, maths
and phy
for the three fields in the input strings.
> let row1 = 'rohan,75,89'
> let row2 = 'rose,88,92'
> const pat3 = /(?<name>[^,]+),(?<maths>[^,]+),(?<phy>[^,]+)/
> row1.match(pat3).groups
< {name: 'rohan', maths: '75', phy: '89'}
> row2.match(pat3).groups
< {name: 'rose', maths: '88', phy: '92'}
13) Surround all whole words with ()
. Additionally, if the whole word is imp
or ant
, delete them. Can you do it with just a single substitution?
> let ip = 'tiger imp goat eagle ant important'
> ip.replace(/\b(?:imp|ant|(\w+))\b/g, '($1)')
< '(tiger) () (goat) (eagle) () (important)'
Lookarounds
Use lookarounds for solving the following exercises even if they are not required.
1) Replace all whole words with X
unless it is preceded by a (
character.
> let ip = '(apple) guava berry) apple (mango) (grape'
> ip.replace(/(?<!\()\b\w+/g, 'X')
< '(apple) X X) X (mango) (grape'
2) Replace all whole words with X
unless it is followed by a )
character.
> let ip = '(apple) guava berry) apple (mango) (grape'
> ip.replace(/\w+\b(?!\))/g, 'X')
< '(apple) X berry) X (mango) (X'
3) Replace all whole words with X
unless it is preceded by (
or followed by )
characters.
> let ip = '(apple) guava berry) apple (mango) (grape'
> ip.replace(/(?<!\()\b\w+\b(?!\))/g, 'X')
< '(apple) X berry) X (mango) (grape'
4) Extract all whole words that do not end with e
or n
.
> let ip = 'a_t row on Urn e note Dust n end a2-e|u'
> ip.match(/\b\w+\b(?<![en])/g)
< ['a_t', 'row', 'Dust', 'end', 'a2', 'u']
5) Extract all whole words that do not start with a
or d
or n
.
> let ip = 'a_t row on Urn e note Dust n end a2-e|u'
> ip.match(/(?![adn])\b\w+/g)
< ['row', 'on', 'Urn', 'e', 'Dust', 'end', 'e', 'u']
6) Extract all whole words only if they are followed by :
or ,
or -
.
> let ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
> ip.match(/\w+(?=[:,-])/g)
< ['Poke', 'so_good', 'ever2']
7) Extract all whole words only if they are preceded by =
or /
or -
.
> let ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
> ip.match(/(?<=[=\/-])\w+/g)
< ['so_good', 'is', 'sit']
8) Extract all whole words only if they are preceded by =
or :
and followed by :
or .
.
> let ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
> ip.match(/(?<=[=:])\w+(?=[:.])/g)
< ['so_good', 'ink']
9) Extract all whole words only if they are preceded by =
or :
or .
or (
or -
and not followed by .
or /
.
> let ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
> ip.match(/(?<=[=:.(-])\w+\b(?![\/.])/g)
< ['so_good', 'vast', 'sit']
10) Remove the leading and trailing whitespaces from all the individual fields where ,
is the field separator.
> let csv1 = ' comma ,separated ,values \t\r '
> let csv2 = 'good bad,nice ice , 42 , , stall small'
> const trim_whitespace = /(?<=^|,)\s+|\s+(?=,|$)/g
> csv1.replace(trim_whitespace, '')
< 'comma,separated,values'
> csv2.replace(trim_whitespace, '')
< 'good bad,nice ice,42,,stall small'
11) Filter elements that satisfy all of these rules:
- should have at least two alphabets
- should have at least three digits
- should have at least one special character among
%
or*
or#
or$
- should not end with a whitespace character
> let pwds = ['hunter2', 'F2h3u%9', '*X3Yz3.14\t', 'r2_d2_42', 'A $B C1234']
> pwds.filter(p => /(?!.*\s$)(?=(.*[a-z]){2})(?=(.*\d){3}).*[%*#$]/i.test(p))
< ['F2h3u%9', 'A $B C1234']
12) For the given string, surround all whole words with {}
except for whole words par
and cat
and apple
.
> let ip = 'part; cat {super} rest_42 par scatter apple spar'
> ip.replace(/\b(?!(?:par|cat|apple)\b)\w+/g, '{$&}')
< '{part}; cat {{super}} {rest_42} par {scatter} apple {spar}'
13) Extract the integer portion of floating-point numbers for the given string. A number ending with .
and no further digits should not be considered.
> let ip = '12 ab32.4 go 5 2. 46.42 5'
> ip.match(/\d+(?=\.\d)/g)
< ['32', '46']
14) For the given input strings, extract all overlapping two character sequences.
> let s1 = 'apple'
> let s2 = '1.2-3:4'
> const pat1 = /.(?=(.))/g
> Array.from(s1.matchAll(pat1), m => m[0] + m[1])
< ['ap', 'pp', 'pl', 'le']
> Array.from(s2.matchAll(pat1), m => m[0] + m[1])
< ['1.', '.2', '2-', '-3', '3:', ':4']
15) The given input strings contain fields separated by the :
character. Delete :
and the last field if there is a digit character anywhere before the last field.
> let s1 = '42:cat'
> let s2 = 'twelve:a2b'
> let s3 = 'we:be:he:0:a:b:bother'
> let s4 = 'apple:banana-42:cherry:'
> let s5 = 'dragon:unicorn:centaur'
> const pat2 = /(?<=\d.*):[^:]*$/
> s1.replace(pat2, '')
< '42'
> s2.replace(pat2, '')
< 'twelve:a2b'
> s3.replace(pat2, '')
< 'we:be:he:0:a:b'
> s4.replace(pat2, '')
< 'apple:banana-42:cherry'
> s5.replace(pat2, '')
< 'dragon:unicorn:centaur'
16) Extract all whole words unless they are preceded by :
or <=>
or ----
or #
.
> let ip = '::very--at<=>row|in.a_b#b2c=>lion----east'
> ip.match(/(?<![:#]|<=>|-{4})\b\w+/g)
< ['at', 'in', 'a_b', 'lion']
17) Match strings if it contains qty
followed by price
but not if there is any whitespace character or the string error
between them.
> let str1 = '23,qty,price,42'
> let str2 = 'qty price,oh'
> let str3 = '3.14,qty,6,errors,9,price,3'
> let str4 = '42\nqty-6,apple-56,price-234,error'
> let str5 = '4,price,3.14,qty,4'
> let str6 = '(qtyprice) (hi-there)'
> const neg = /qty((?!\s|error).)*price/
> neg.test(str1)
< true
> neg.test(str2)
< false
> neg.test(str3)
< false
> neg.test(str4)
< true
> neg.test(str5)
< false
> neg.test(str6)
< true
18) Can you reason out why the following regular expressions behave differently?
\b
matches both the start and end of word locations. In the below example, \b..\b
doesn't necessarily mean that the first \b
will match only the start of word location and the second \b
will match only the end of word location. They can be any combination! For example, I
followed by space in the input string here is using the start of word location for both the conditions. Similarly, space followed by 2
is using the end of word location for both the conditions.
In contrast, the negative lookarounds version ensures that there are no word characters around any two characters. Also, such assertions will always be satisfied at the start of string and the end of string respectively. But \b
depends on the presence of word characters. For example, !
at the end of the input string here matches the lookaround assertion but not word boundary.
> let ip = 'I have 12, he has 2!'
> ip.replace(/\b..\b/g, '{$&}')
< '{I }have {12}{, }{he} has{ 2}!'
> ip.replace(/(?<!\w)..(?!\w)/g, '{$&}')
< 'I have {12}, {he} has {2!}'
19) Simulate string partitioning to get an array of three elements — string before the separator, portion matched by the separator and string after the separator. For the first case, split the given input string on the first occurrence of digits. For the second case, split based on the last occurrence of digits.
> let w2 = 'Sample123string42with777numbers'
> w2.split(/(?<!\d.*)(\d+)/)
< ['Sample', '123', 'string42with777numbers']
> w2.split(/(\d+)(?!.*\d)/)
< ['Sample123string42with', '777', 'numbers']
20) Find the starting index of the last occurrence of is
or the
or was
or to
for the given input strings using the search()
method. Assume that there will be at least one match for each input string.
> let s1 = 'match after the last newline character'
> let s2 = 'and then you want to test'
> let s3 = 'this is good bye then'
> let s4 = 'who was there to see?'
> const pat3 = /(is|the|was|to)(?!.*(is|the|was|to))/
> s1.search(pat3)
< 12
> s2.search(pat3)
< 18
> s3.search(pat3)
< 17
> s4.search(pat3)
< 14
Unicode
1) Check if the given input strings are made up of ASCII characters only. Consider the input to be non-empty strings and any character that isn't part of the 7-bit ASCII set should result in false
.
> let str1 = '123 × 456'
> let str2 = 'good fοοd'
> let str3 = 'happy learning!'
// can also use: const pat1 = /^[\x00-\x7f]+$/
> const pat1 = /^\p{ASCII}+$/u
> pat1.test(str1)
< false
> pat1.test(str2)
< false
> pat1.test(str3)
< true
2) Retain only the punctuation characters for the given string.
> let ip = '❨a❩❪1❫❬b❭❮2❯❰c❱❲3❳❴xyz❵⟅123⟆⟦⟧⟨like⟩⟪3.14⟫'
> ip.replace(/\P{P}+/gu, '')
< '❨❩❪❫❬❭❮❯❰❱❲❳❴❵⟅⟆⟦⟧⟨⟩⟪.⟫'
3) Is the following code snippet showing the correct output?
Yes. Some regular expression engines allow escape sequences like \d
, \b
, \s
, \w
, etc to be Unicode aware, but not JavaScript.
> 'fox:αλεπού'.match(/\w+/g)
< ['fox']
4) Name the set operations enabled by the v
flag.
The following set operations are enabled by the v
flag inside character classes:
&&
intersection--
difference
To aid in such definitions, you can use []
in nested fashion.
5) Extract all whole words from the given strings. However, do not match words if they contain any character present in the ignore
variable.
> let s1 = 'match after the last new_line character A2'
> let s2 = 'and then you want to test'
> let ignore = 'aty'
> const ign1 = new RegExp(`\\b[\\w--[${ignore}]]+\\b`, 'gv')
> ign1
< /\b[\w--[aty]]+\b/gv
> s1.match(ign1)
< ['new_line', 'A2']
> s2.match(ign1)
< null
> let ignore = 'esw'
// should be the same solution used above
> const ign2 = new RegExp(`\\b[\\w--[${ignore}]]+\\b`, 'gv')
> ign2
< /\b[\w--[esw]]+\b/gv
> s1.match(ign2)
< ['match', 'A2']
> s2.match(ign2)
< ['and', 'you', 'to']