Just as a(b+c)d = abd+acd in maths, you get a(b|c)d = abd|acd in regular expressions. However, you'll have to be careful if quantifiers are involved.
(a*|b*) isn't the same as (a|b)*. Can you reason out why? Here's a railroad diagram to help you out:
The difference is that (a*|b*) only matches same-letter sequences like aaaaaa, bbbb, etc. But (a|b)* can match mixed sequences like ababbba too. You can also simplify (a|b)* to [ab]* since it is just single-character alternation in this particular example.
Here's an illustration using Python:
>>> import re
>>> test = ['aa', 'abbaba', 'aaabbb', 'bbbbb', 'abc']
>>> [s for s in test if re.fullmatch(r'(a*|b*)', s)]
['aa', 'bbbbb']
>>> [s for s in test if re.fullmatch(r'(a|b)*', s)]
['aa', 'abbaba', 'aaabbb', 'bbbbb']
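
If you'd like to check such equivalences yourself, one option is a brute-force comparison over every short string from a small alphabet. The sketch below is only an illustration: the helper name same_language, the alphabets and the length limits are assumptions made for this example, and agreement up to a bounded length is evidence rather than a proof.

```python
import re
from itertools import product

def same_language(pat1, pat2, alphabet, max_len):
    """Hypothetical helper: True if both patterns fully match exactly
    the same strings over `alphabet`, for lengths 0 through max_len."""
    p1, p2 = re.compile(pat1), re.compile(pat2)
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = ''.join(chars)
            # compare match/no-match outcomes, not the match objects
            if bool(p1.fullmatch(s)) != bool(p2.fullmatch(s)):
                return False
    return True

# alternation distributes over concatenation
print(same_language(r'a(b|c)d', r'abd|acd', 'abcd', 4))  # True

# single character alternation behaves like a character class
print(same_language(r'(a|b)*', r'[ab]*', 'abc', 5))      # True

# moving the quantifier inside the group changes what is matched
print(same_language(r'(a*|b*)', r'(a|b)*', 'ab', 3))     # False
```

The last comparison fails as soon as a mixed string such as ab is tried, which is the difference described above.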