Lookarounds help to create custom anchors and add conditions within a regex definition. These assertions are also known as zero-width patterns because they add restrictions similar to anchors and are not part of the matched portions. The syntax for negative lookarounds is shown below:

  • (?!pat) negative lookahead assertion
  • (?<!pat) negative lookbehind assertion

Here are some examples:

>>> import re

# change 'cat' only if it is not followed by a digit character
# note that the end of string satisfies the given assertion
# 'catcat' has two matches as the assertion doesn't consume characters
>>> re.sub(r'cat(?!\d)', 'dog', 'hey cats! cat42 cat_5 catcat')
'hey dogs! cat42 dog_5 dogdog'

# change 'cat' only if it is not preceded by _
# note how 'cat' at the start of string is matched as well
>>> re.sub(r'(?<!_)cat', 'dog', 'cat _cat 42catcat')
'dog _cat 42dogdog'

# change whole word only if it is not preceded by : or -
>>> re.sub(r'(?<![:-])\b\w+', 'X', ':cart <apple: -rest ;tea')
':cart <X: -rest ;X'
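To see why the zero-width property matters, it can help to compare a negative lookahead with a negated character class. The sketch below is illustrative and not part of the original examples; the variable names `consumed` and `asserted` are made up for this comparison. The character class has to match (and therefore replace) one extra character, so it changes the text after `cat` and misses the back-to-back `cat` as well as a `cat` at the end of the string.

```python
import re

s = 'hey cats! cat42 cat_5 catcat'

# negated character class: requires one more character and consumes it,
# so that character is replaced along with 'cat'
consumed = re.sub(r'cat[^\d]', 'dog', s)

# negative lookahead: only inspects what follows, nothing extra is consumed
asserted = re.sub(r'cat(?!\d)', 'dog', s)

print(consumed)  # hey dog! cat42 dog5 dogat
print(asserted)  # hey dogs! cat42 dog_5 dogdog
```

In the first result, the `s`, `_` and second `c` are lost because they were part of the match. The final `cat` is left untouched for two reasons: its `c` was already consumed, and there is no character after it for `[^\d]` to match.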

Lookarounds can be placed anywhere and multiple lookarounds can be combined in any order. They do not consume characters nor do they play a role in matched portions. They just let you know whether the condition you want to test is satisfied from the current location in the input string.

# extract all whole words that do not start with a or n
>>> ip = 'a_t row on Urn e note Dust n end a2-e|u'
>>> re.findall(r'(?![an])\b\w+', ip)
['row', 'on', 'Urn', 'e', 'Dust', 'end', 'e', 'u']

# since the three assertions used here are all zero-width,
# all 6 possible orderings of them are equivalent
>>> re.sub(r'(?!\Z)\b(?<!\A)', ' ', 'output=num1+35*42/num2')
'output = num1 + 35 * 42 / num2'
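The claim about ordering can be checked mechanically. The following is a small sketch, assuming only the standard `itertools.permutations` helper; the names `parts` and `results` are hypothetical and chosen for this illustration. It builds every ordering of the three zero-width pieces and collects the distinct outputs in a set.

```python
import re
from itertools import permutations

s = 'output=num1+35*42/num2'

# the three zero-width pieces used in the example above
parts = [r'(?!\Z)', r'\b', r'(?<!\A)']

# apply every ordering and keep only the distinct results
results = {re.sub(''.join(p), ' ', s) for p in permutations(parts)}

print(results)  # {'output = num1 + 35 * 42 / num2'}
```

A single element in the set means all six orderings produced the same string.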

See also my 100 Page Python Intro and Understanding Python re(gex)? ebooks.