Flags

Just like options change the default behavior of shell commands, flags are used to change aspects of regular expressions. Some flags, like g and p, have already been discussed. For completeness, they are covered again in this chapter. In regular expression parlance, flags are also known as modifiers.

Case insensitive matching

The I flag matches a pattern case insensitively.

$ # match 'cat' case sensitively
$ printf 'Cat\ncOnCaT\nscatter\ncot\n' | sed -n '/cat/p'
scatter

$ # match 'cat' case insensitively
$ # note that command p cannot be used before flag I
$ printf 'Cat\ncOnCaT\nscatter\ncot\n' | sed -n '/cat/Ip'
Cat
cOnCaT
scatter

$ # match 'cat' case insensitively and replace it with 'dog'
$ printf 'Cat\ncOnCaT\nscatter\ncot\n' | sed 's/cat/dog/I'
dog
cOndog
sdogter
cot

Info: Usually, i is used for such purposes (grep -i, for example). But i is a command in sed (discussed in the append, change, insert chapter), so /REGEXP/i cannot be used. The substitute command does allow both i and I, but I is recommended for consistency.
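The I flag is not tied to the p command. Here is a minimal sketch, assuming GNU sed (the I modifier is a GNU extension) and reusing the sample input from above. The variable names are only for illustration.

```shell
# assumes GNU sed; variable names are illustrative
input='Cat\ncOnCaT\nscatter\ncot\n'

# delete every line containing 'cat' in any letter case
deleted=$(printf "$input" | sed '/cat/Id')
printf '%s\n' "$deleted"    # cot

# the substitute command accepts both i and I, with identical results
lower_i=$(printf "$input" | sed 's/cat/dog/i')
upper_i=$(printf "$input" | sed 's/cat/dog/I')
[ "$lower_i" = "$upper_i" ] && echo 'same output'
```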

Changing case in replacement section

This section isn't actually about flags, but presented in this chapter to complement the I flag. sed provides escape sequences to change the case of replacement strings, which might include backreferences, shell variables, etc.

Escape Sequence    Description
\E                 indicates end of case conversion
\l                 convert next character to lowercase
\u                 convert next character to uppercase
\L                 convert following characters to lowercase, unless \U or \E is used
\U                 convert following characters to uppercase, unless \L or \E is used

First up, changing case of only the immediate next character after the escape sequence.

$ # match only first character of word using word boundary
$ # use & to backreference the matched character
$ # \u would then change it to uppercase
$ echo 'hello there. how are you?' | sed 's/\b\w/\u&/g'
Hello There. How Are You?

$ # change first character of word to lowercase
$ echo 'HELLO THERE. HOW ARE YOU?' | sed 's/\b\w/\l&/g'
hELLO tHERE. hOW aRE yOU?

$ # match lowercase followed by underscore followed by lowercase
$ # delete underscore and convert 2nd lowercase to uppercase
$ echo '_foo aug_price next_line' | sed -E 's/([a-z])_([a-z])/\1\u\2/g'
_foo augPrice nextLine

Next, changing case of multiple characters at a time.

$ # change all letters to lowercase
$ echo 'HaVE a nICe dAy' | sed 's/.*/\L&/'
have a nice day
$ # change all letters to uppercase
$ echo 'HaVE a nICe dAy' | sed 's/.*/\U&/'
HAVE A NICE DAY

$ # \E will stop further conversion
$ echo '_foo aug_price next_line' | sed -E 's/([a-z]+)(_[a-z]+)/\U\1\E\2/g'
_foo AUG_price NEXT_line
$ # \L or \U will override any existing conversion
$ echo 'HeLLo:bYe gOoD:beTTEr' | sed -E 's/([a-z]+)(:[a-z]+)/\L\1\U\2/Ig'
hello:BYE good:BETTER

Finally, examples where escapes can be used next to each other.

$ # uppercase first character of a word
$ # and lowercase rest of the word characters
$ # note the order of escapes used, \u\L won't work
$ echo 'HeLLo:bYe gOoD:beTTEr' | sed -E 's/[a-z]+/\L\u&/Ig'
Hello:Bye Good:Better

$ # lowercase first character of a word
$ # and uppercase rest of the word characters
$ echo 'HeLLo:bYe gOoD:beTTEr' | sed -E 's/[a-z]+/\U\l&/Ig'
hELLO:bYE gOOD:bETTER
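As a further illustration of these escapes, here is a small sketch assuming GNU sed. The sample strings and variable names are made up for this example. The first command combines \u with a capture group to build a CamelCase name, and the second relies on \E so that only the first group is uppercased.

```shell
# assumes GNU sed; sample strings and variable names are illustrative

# snake_case to CamelCase: drop each '_' and uppercase the letter after it
camel=$(echo 'next_line_count' | sed -E 's/(^|_)([a-z])/\u\2/g')
echo "$camel"    # NextLineCount

# uppercase only the label before ':'
# without \E, the \U conversion would continue into \2 as well
label=$(echo 'error: disk full' | sed -E 's/^([a-z]+)(:.*)/\U\1\E\2/')
echo "$label"    # ERROR: disk full
```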

Global replace

As seen earlier, the substitute command replaces only the first occurrence of the search pattern by default. Use the g flag to replace all the matches.

$ # change only first ',' to '-'
$ printf '1,2,3,4\na,b,c,d\n' | sed 's/,/-/'
1-2,3,4
a-b,c,d

$ # change all matches by adding 'g' flag
$ printf '1,2,3,4\na,b,c,d\n' | sed 's/,/-/g'
1-2-3-4
a-b-c-d

Replace specific occurrences

A number provided as a flag will cause only the Nth match to be replaced.

$ # default substitution replaces first occurrence
$ echo 'foo:123:bar:baz' | sed 's/:/-/'
foo-123:bar:baz
$ echo 'foo:123:bar:baz' | sed -E 's/[^:]+/"&"/'
"foo":123:bar:baz

$ # replace second occurrence
$ echo 'foo:123:bar:baz' | sed 's/:/-/2'
foo:123-bar:baz
$ echo 'foo:123:bar:baz' | sed -E 's/[^:]+/"&"/2'
foo:"123":bar:baz

$ # replace third occurrence and so on
$ echo 'foo:123:bar:baz' | sed 's/:/-/3'
foo:123:bar-baz
$ echo 'foo:123:bar:baz' | sed -E 's/[^:]+/"&"/3'
foo:123:"bar":baz

Greedy quantifiers can be used to replace the Nth match from the end of the line.

$ # replacing last occurrence
$ # can also use sed -E 's/:([^:]*)$/[]\1/'
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):/\1[]/'
456:foo:123:bar:789[]baz

$ # replacing last but one
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):(.*:)/\1[]\2/'
456:foo:123:bar[]789:baz

$ # generic version, where {N} refers to last but N
$ echo '456:foo:123:bar:789:baz' | sed -E 's/(.*):((.*:){2})/\1[]\2/'
456:foo:123[]bar:789:baz

Warning: See the unix.stackexchange thread "Why doesn't this sed command replace the 3rd-to-last "and"?" for a bug related to the use of word boundaries in the ((pat){N}) generic case.
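The generic version can be wrapped in a small shell function so that N becomes a parameter. This is only a sketch, assuming GNU sed. The name replace_last_but is hypothetical, and the function is hard-coded for the ':' separator used in the examples above.

```shell
# assumes GNU sed; replace_last_but is a hypothetical helper name
# replaces the ':' that has exactly N more ':' characters after it
replace_last_but() {
    # double quotes let the shell expand $1 inside the {N} quantifier
    sed -E "s/(.*):((.*:){$1})/\1[]\2/"
}

s='456:foo:123:bar:789:baz'
r1=$(echo "$s" | replace_last_but 1)
r2=$(echo "$s" | replace_last_but 2)
echo "$r1"    # 456:foo:123:bar[]789:baz
echo "$r2"    # 456:foo:123[]bar:789:baz
```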

A combination of number and g flag will replace all matches except the first N-1 occurrences. In other words, all matches starting from the Nth occurrence will be replaced.

$ # replace all except the first occurrence
$ echo '456:foo:123:bar:789:baz' | sed -E 's/:/[]/2g'
456:foo[]123[]bar[]789[]baz

$ # replace all except the first three occurrences
$ echo '456:foo:123:bar:789:baz' | sed -E 's/:/[]/4g'
456:foo:123:bar[]789[]baz

If multiple Nth occurrences are to be replaced, use descending order for readability.

$ # replace second and third occurrences
$ # note the numbers used
$ echo '456:foo:123:bar:789:baz' | sed 's/:/[]/2; s/:/[]/2'
456:foo[]123[]bar:789:baz

$ # better way is to use descending order
$ echo '456:foo:123:bar:789:baz' | sed 's/:/[]/3; s/:/[]/2'
456:foo[]123[]bar:789:baz

$ # replace second, third and fifth occurrences
$ echo '456:foo:123:bar:789:baz' | sed 's/:/[]/5; s/:/[]/3; s/:/[]/2'
456:foo[]123[]bar:789[]baz
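The descending-order idea can be automated by sorting the occurrence numbers before building the commands. The following is a rough sketch, assuming GNU sed. The function name replace_occurrences is hypothetical, and the ':' to '[]' substitution is taken from the examples above.

```shell
# assumes GNU sed; replace_occurrences is a hypothetical helper name
replace_occurrences() {
    script=''
    # sort the occurrence numbers in descending order, so that
    # an earlier substitution never shifts the count of a later one
    for n in $(printf '%s\n' "$@" | sort -rn); do
        script="${script}s/:/[]/${n};"
    done
    sed "$script"
}

out=$(echo '456:foo:123:bar:789:baz' | replace_occurrences 2 5 3)
echo "$out"    # 456:foo[]123[]bar:789[]baz
```

With this approach the caller can list the occurrences in any order.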

Print flag

The p flag was already introduced in the Selective editing chapter. With the -n option active, it prints the line only if the substitution succeeds.

$ # no output if no substitution
$ echo 'hi there. have a nice day' | sed -n 's/xyz/XYZ/p'

$ # modified line is displayed if substitution succeeds
$ echo 'hi there. have a nice day' | sed -n 's/\bh/H/pg'
Hi there. Have a nice day
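The -n option matters when the p flag is used. Here is a short sketch, assuming GNU sed, with made-up input and variable names. Without -n, a changed line shows up twice: once because of the p flag and once because of the default print at the end of the cycle.

```shell
# assumes GNU sed; input and variable names are illustrative

# with -n, only lines changed by the substitution are printed
only_changed=$(printf 'hi\nbye\n' | sed -n 's/hi/HI/p')
echo "$only_changed"    # HI

# without -n, the changed line is printed twice and 'bye' once
twice=$(printf 'hi\nbye\n' | sed 's/hi/HI/p')
echo "$twice"
```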

Write to a file

The w flag redirects contents to a specified filename instead of the default stdout. This flag applies to both filtering and substitution commands. You might wonder why not simply use shell redirection. Since sed allows multiple commands, the w flag can be applied selectively, can write to multiple files, and so on.

$ # space between w and filename is optional
$ # same as: sed -n 's/3/three/p' > 3.txt
$ seq 20 | sed -n 's/3/three/w 3.txt'
$ cat 3.txt
three
1three

$ # do not use -n if output should be displayed as well as written to file
$ printf '1,2,3,4\na,b,c,d\n' | sed 's/,/:/gw cols.txt'
1:2:3:4
a:b:c:d
$ cat cols.txt
1:2:3:4
a:b:c:d

For multiple output files, use -e for each file. Don't use ; between commands as that will be interpreted as part of the filename!

$ seq 20 | sed -n -e 's/5/five/w 5.txt' -e 's/7/seven/w 7.txt'
$ cat 5.txt
five
1five
$ cat 7.txt
seven
1seven
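The examples so far attach w to a substitution. As a sketch of the filtering case, the following uses w with a REGEXP address and no substitution at all. The file names and the temporary directory are illustrative choices, not part of the original examples.

```shell
# file names and the temporary directory are illustrative
tmpdir=$(mktemp -d)

# w used with a REGEXP address, one -e per output file
(cd "$tmpdir" && seq 10 | sed -n -e '/[02468]$/w even.txt' -e '/[13579]$/w odd.txt')

even=$(cat "$tmpdir/even.txt")
odd=$(cat "$tmpdir/odd.txt")
echo "$even"    # 2 4 6 8 10, one number per line
echo "$odd"     # 1 3 5 7 9, one number per line
rm -r "$tmpdir"
```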

There are two predefined filenames:

  • /dev/stdout to write to stdout
  • /dev/stderr to write to stderr

$ # in-place editing as well as display changes on stdout
$ sed -i 's/three/3/w /dev/stdout' 3.txt
3
13
$ cat 3.txt
3
13

Executing external commands

The e flag allows you to use the output of a shell command. The external command can be based on the pattern space contents or provided as an argument. Quoting from the manual:

This command allows one to pipe input from a shell command into pattern space. Without parameters, the e command executes the command that is found in pattern space and replaces the pattern space with the output; a trailing newline is suppressed.

If a parameter is specified, instead, the e command interprets it as a command and sends its output to the output stream. The command can run across multiple lines, all but the last ending with a back-slash.

In both cases, the results are undefined if the command to be executed contains a NUL character.

First, examples with substitution command.

$ # sample input
$ printf 'Date:\nreplace this line\n'
Date:
replace this line

$ # replacing entire line with output of shell command
$ printf 'Date:\nreplace this line\n' | sed 's/^replace.*/date/e'
Date:
Wed Aug 14 11:39:39 IST 2019

If the p flag is used as well, order is important. Quoting from the manual:

when both the p and e options are specified, the relative ordering of the two produces very different results. In general, ep (evaluate then print) is what you want, but operating the other way round can be useful for debugging. For this reason, the current version of GNU sed interprets specially the presence of p options both before and after e, printing the pattern space before and after evaluation, while in general flags for the s command show their effect just once. This behavior, although documented, might change in future versions.

$ printf 'Date:\nreplace this line\n' | sed -n 's/^replace.*/date/ep'
Wed Aug 14 11:42:48 IST 2019

$ printf 'Date:\nreplace this line\n' | sed -n 's/^replace.*/date/pe'
date

If only a portion of the line is replaced, the complete modified line after substitution will get executed as a shell command.

$ # after substitution, the command that gets executed is 'seq 5'
$ echo 'xyz 5' | sed 's/xyz/seq/e'
1
2
3
4
5

Next, examples with filtering alone.

$ # execute entire matching line as a shell command
$ # replaces the matching line with output of the command
$ printf 'date\ndate -I\n' | sed '/date/e'
Wed Aug 14 11:51:06 IST 2019
2019-08-14
$ printf 'date\ndate -I\n' | sed '2e'
date
2019-08-14

$ # command provided as argument, output is inserted before matching line
$ printf 'show\nexample\n' | sed '/am/e seq 2'
show
1
2
example
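Since the date examples produce different output on every run, here is a sketch with deterministic results, assuming GNU sed (e is a GNU extension). The input strings and variable names are made up for illustration. Because the text is handed to the shell for execution, these forms are best avoided on untrusted input.

```shell
# assumes GNU sed; input strings and variable names are illustrative

# after substitution the line becomes: printf "%03d" 7
# the e flag then runs that line as a shell command
padded=$(echo 'width 7' | sed -E 's/width ([0-9]+)/printf "%03d" \1/e')
echo "$padded"    # 007

# e with an argument: the command output is inserted before line 2
inserted=$(printf 'one\ntwo\n' | sed '2e echo inserted')
echo "$inserted"
```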

Multiline mode

The m (or M) flag changes the behavior of the ^, $ and . metacharacters. This comes into play only if there are multiple lines in the pattern space to operate on, for example when the N command is used.

If m flag is used, the . metacharacter will not match the newline character.

$ # without 'm' flag . will match newline character
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/H.*e/X/'
X Day

$ # with 'm' flag . will not match across lines
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/H.*e/X/gm'
X
X Day

The ^ and $ anchors will match every line's start and end locations when m flag is used.

$ # without 'm' flag line anchors will match once for whole string
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/^/* /g'
* Hi there
Have a Nice Day
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/$/./g'
Hi there
Have a Nice Day.

$ # with 'm' flag line anchors will work for every line
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/^/* /gm'
* Hi there
* Have a Nice Day
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/$/./gm'
Hi there.
Have a Nice Day.
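The same behavior holds when more than two lines are in the pattern space. Here is a minimal sketch, assuming GNU sed, that uses N twice to collect three lines. The input and variable names are illustrative.

```shell
# assumes GNU sed; input and variable names are illustrative
# 'N; N' pulls three lines into the pattern space before substituting

# without m, ^ matches only at the start of the whole pattern space
plain=$(printf 'a\nb\nc\n' | sed 'N; N; s/^/> /g')
# with m, ^ matches at the start of each of the three lines
multi=$(printf 'a\nb\nc\n' | sed 'N; N; s/^/> /gm')

echo "$plain"
echo "$multi"
```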

The \` and \' anchors will always match the start and end of entire string, irrespective of single or multiline mode.

$ # similar to \A start of string anchor found in other implementations
$ printf 'Hi there\nHave a Nice Day\n' | sed 'N; s/\`/* /gm'
* Hi there
Have a Nice Day

$ # similar to \Z end of string anchor found in other implementations
$ # note the use of double quotes
$ # with single quotes, it will be: sed 'N; s/\'\''/./gm'
$ printf 'Hi there\nHave a Nice Day\n' | sed "N; s/\'/./gm"
Hi there
Have a Nice Day.

Usually, regular expression implementations have separate flags to control the behavior of the . metacharacter and the line anchors. Having a single flag restricts flexibility. For example, you cannot make . match across lines if the m flag is used in sed. You'll have to resort to creative alternatives in such cases, as shown below.

$ # \w|\W or .|\n can also be used
$ # recall that sed doesn't allow character set sequences inside []
$ printf 'Hi there\nHave a Nice Day\n' | sed -E 'N; s/H(\s|\S)*e/X/m'
X Day

$ # this one doesn't use alternation
$ printf 'Hi there\nHave a Nice Day\n' | sed -E 'N; s/H(.*\n.*)*e/X/m'
X Day

Cheatsheet and summary

Note          Description
flag          changes default behavior of REGEXP
I             match case insensitively for REGEXP address
i or I        match case insensitively for substitution command
\E            indicates end of case conversion in replacement section
\l            convert next character to lowercase
\u            convert next character to uppercase
\L            convert following characters to lowercase, unless \U or \E is used
\U            convert following characters to uppercase, unless \L or \E is used
g             replace all occurrences instead of just the first match
N             a number will cause only the Nth match to be replaced
p             prints line only if substitution succeeds (assuming -n is active)
w filename    write contents of pattern space to given filename
              whenever the REGEXP address matches or substitution succeeds
e             executes contents of pattern space as shell command
              and replaces the pattern space with command output
              if argument is passed, executes that external command
              and inserts output before matching lines
m or M        multiline mode flag
              . will not match the newline character
              ^ and $ will match every line's start and end locations
\`            always match the start of string irrespective of m flag
\'            always match the end of string irrespective of m flag

This chapter showed how flags can be used for extra functionality. Some of the flags interact with the shell as well. In the next chapter, you'll learn how to incorporate shell variables and command outputs to dynamically construct a sed command.

Exercises

a) For the input file para.txt, remove all groups of lines marked with a line beginning with start and a line ending with end. Match both these markers case insensitively.

$ cat para.txt
good start
Start working on that
project you always wanted
to, do not let it end
hi there
start and try to
finish the End
bye

$ sed ##### add your solution here
good start
hi there
bye

b) The given sample input below starts with one or more # characters followed by one or more whitespace characters and then some words. Convert such strings to corresponding output as shown below.

$ echo '# Regular Expressions' | sed ##### add your solution here
regular-expressions
$ echo '## Compiling regular expressions' | sed ##### add your solution here
compiling-regular-expressions

c) Using the input file para.txt, create a file named five.txt with all lines that contain a whole word of length 5 and a file named six.txt with all lines that contain a whole word of length 6.

$ sed ##### add your solution here

$ cat five.txt
good start
Start working on that
hi there
start and try to
$ cat six.txt
project you always wanted
finish the End

d) Given sample strings have fields separated by , where field values can be empty as well. Use sed to replace the third field with 42.

$ echo 'lion,,ant,road,neon' | sed ##### add your solution here
lion,,42,road,neon

$ echo ',,,' | sed ##### add your solution here
,,42,

e) Replace all occurrences of e with 3 except the first two matches.

$ echo 'asset sets tests site' | sed ##### add your solution here
asset sets t3sts sit3

$ echo 'sample item teem eel' | sed ##### add your solution here
sample item t33m 33l

f) For the input file addr.txt, replace each input line with the number of characters in that line. wc -L is one of the ways to get the length of a line, as shown below.

$ # note that newline character isn't counted, which is preferable here
$ echo "Hello World" | wc -L
11

$ sed ##### add your solution here
11
11
17
14
5
13

g) For the input file para.txt, assume that it'll always have lines in multiples of 4. Use sed commands such that there are 4 lines at a time in the pattern space. Then, delete from start till end provided start is matched only at the start of a line. Also, match these two keywords case insensitively.

$ sed ##### add your solution here
good start

hi there

bye

h) For the given strings, replace the last but three occurrence of so (the match that has exactly three more matches after it) with X. Only print the lines which are changed by the substitution.

$ printf 'so and so also sow and soup\n' | sed ##### add your solution here
so and X also sow and soup

$ printf 'sososososososo\nso and so\n' | sed ##### add your solution here
sososoXsososo

i) Display all lines that satisfy both of these conditions:

  • professor matched irrespective of case
  • quip or this matched case sensitively

Input is a file downloaded from the internet as shown below.

$ wget https://www.gutenberg.org/files/345/old/345.txt -O dracula.txt

$ sed ##### add your solution here
equipment of a professor of the healing craft. When we were shown in,
should be. I could see that the Professor had carried out in this room,
"Not up to this moment, Professor," she said impulsively, "but up to
and sprang at us. But by this time the Professor had gained his feet,
this time the Professor had to ask her questions, and to ask them pretty