Shell substitutions

So far, the sed commands have been constructed statically: all the details, such as which line numbers to act upon, the search REGEXP and the replacement string, were known in advance. When it comes to automation and scripting, you'll often need to construct commands dynamically based on user input, file contents, etc. Sometimes, the output of a shell command is needed as part of the replacement string. This chapter discusses how to incorporate shell variables and command output to compose a sed command dynamically. As mentioned before, this book assumes GNU bash as the shell being used.

info As an example, see my tool ch: command help for a practical shell script, where commands are constructed dynamically.

info The example_files directory has all the files used in the examples.

Variable substitution

The characters you type on the command line are first interpreted by the shell before the command is executed. Wildcards are matched against filenames, pipes and redirections are set up, double quotes are interpolated and so on. For some use cases, it is simply easier to use double quotes instead of single quotes for the script passed to the sed command. That way, shell variables get replaced by their values.

info See wooledge: Quotes and unix.stackexchange: Why does my shell script choke on whitespace or other special characters? for details about various quoting mechanisms in bash and when quoting is needed.

In the examples below, the numeric address values (the starting line number and the number of lines that follow it) are provided via shell variables. The entire script can be safely enclosed in double quotes in such cases.

$ start=6; step=1
$ sed -n "${start},+${step}p" rhymes.txt
There are so many delights to cherish
Apple, Banana and Cherry

$ step=3
$ sed -n "${start},+${step}p" rhymes.txt
There are so many delights to cherish
Apple, Banana and Cherry
Bread, Butter and Jelly
Try them all before you perish
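
Double quoting the whole script is safe only as long as the variables really hold numbers. As a minimal sketch (the case check and the inline sample input are illustrative assumptions, not part of the examples above), you could validate the values before handing them to sed:

```shell
start=2; step=1

# reject anything that is empty or contains a non-digit character
for v in "$start" "$step"; do
    case $v in
        ''|*[!0-9]*) printf 'not a number: %s\n' "$v" >&2; exit 1 ;;
    esac
done

# inline sample input instead of rhymes.txt, to keep the sketch self-contained
out=$(printf 'one\ntwo\nthree\nfour\n' | sed -n "${start},+${step}p")
printf '%s\n' "$out"
# two
# three
```

If the check fails, the script stops before sed ever sees the value.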

But, if the shell variables can contain any generic string instead of just numbers, it is strongly recommended to use double quotes only where they are needed. Otherwise, ordinary characters that are part of the sed command may get interpreted by the shell because of the double quotes. Bash allows unquoted, single quoted and double quoted strings to be concatenated by simply placing them next to each other.

# ! is special within double quotes
# !d got expanded to 'date -Is' from my history and hence the error
$ word='at'
$ printf 'sea\neat\ndrop\n' | sed "/${word}/!d"
printf 'sea\neat\ndrop\n' | sed "/${word}/date -Is"
sed: -e expression #1, char 6: extra characters after command

# use double quotes only for the variable substitution
# and single quotes for everything else
# the command is concatenation of '/' and "${word}" and '/!d'
$ printf 'sea\neat\ndrop\n' | sed '/'"${word}"'/!d'
eat

After you've properly separated the single and double quoted portions, you need to take care of a few more things to robustly construct a dynamic command. First, you'll have to ensure that the shell variable is properly preprocessed to avoid conflict with whichever delimiter is being used for search and substitution operations.

info See wooledge: Parameter Expansion for details about the Bash feature used in the example below.

# error because '/' inside HOME value conflicts with '/' as the delimiter
$ echo 'home path is:' | sed 's/$/ '"${HOME}"'/'
sed: -e expression #1, char 7: unknown option to `s'
# using a different delimiter will help in this particular case
$ echo 'home path is:' | sed 's|$| '"${HOME}"'|'
home path is: /home/learnbyexample

# but you may not have the luxury of choosing a delimiter
# in such cases, escape all delimiter characters before variable substitution
$ home=${HOME//\//\\/}
$ echo 'home path is:' | sed 's/$/ '"${home}"'/'
home path is: /home/learnbyexample
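
To see why the first attempt fails, it can help to print the script that sed actually receives. The sketch below uses a hard-coded dir value (a hypothetical stand-in for HOME, so that the output is reproducible) and escapes the delimiter with sed instead of parameter expansion:

```shell
dir='/home/user'    # hypothetical stand-in for ${HOME}

# unescaped: the slashes in the value are seen as extra delimiters
printf '%s\n' 's/$/ '"${dir}"'/'
# s/$/ /home/user/

# escaped: every / in the value becomes \/
esc=$(printf '%s' "$dir" | sed 's|/|\\/|g')
printf '%s\n' 's/$/ '"${esc}"'/'
# s/$/ \/home\/user/

res=$(echo 'home path is:' | sed 's/$/ '"${esc}"'/')
printf '%s\n' "$res"
# home path is: /home/user
```

Printing the assembled script with printf before running it is a cheap way to debug any dynamically constructed command.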

warning If the variable value is obtained from an external source, such as user input, then you need to worry about security too. See unix.stackexchange: security consideration when using shell substitution for more details.
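
As a harmless illustration of the risk (the word value and the sample input are made up for this sketch), a value containing the delimiter can end the regexp early and smuggle in extra sed commands:

```shell
# intent: keep only the lines containing the text stored in 'word'
word='a/d; /x'    # hypothetical hostile value containing the delimiter

# unescaped, sed receives: /a/d; /x/!d
# which deletes lines containing 'a' and then lines not containing 'x'
unsafe=$(printf 'sea\na/d; /x\ndrop\n' | sed '/'"${word}"'/!d')
printf 'unsafe: [%s]\n' "$unsafe"
# unsafe: []

# with the delimiter escaped, the whole value stays inside the regexp
esc=$(printf '%s' "$word" | sed 's|/|\\/|g')
safe=$(printf 'sea\na/d; /x\ndrop\n' | sed '/'"${esc}"'/!d')
printf 'safe: [%s]\n' "$safe"
# safe: [a/d; /x]
```

Here the damage is limited to deleting the wrong lines. GNU sed also has commands like e (execute a shell command) and w (write to a file), so an injected script could do far worse.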

Escaping metacharacters

Next, you have to properly escape all the metacharacters depending upon whether the shell variable is used as the search or the replacement string. This is needed only if the content of the variable has to be treated literally. Here's an example to illustrate the issue with one metacharacter.

$ c='&'
# & backreferences the entire matched portion
$ echo 'a and b and c' | sed 's/and/['"${c}"']/g'
a [and] b [and] c

# escape all occurrences of & to insert it literally
$ c1=${c//&/\\&}
$ echo 'a and b and c' | sed 's/and/['"${c1}"']/g'
a [&] b [&] c

Typically, you'd need to escape \, & and the delimiter for variables used in the replacement section. For the search section, the characters to be escaped will depend upon whether you are using BRE or ERE.

# replacement string
$ r='a/b&c\d'
$ r=$(printf '%s' "$r" | sed 's#[\&/]#\\&#g')

# ERE version for search string
$ s='{[(\ta^b/d).*+?^$|]}'
$ s=$(printf '%s' "$s" | sed 's#[{[()^$*?+.\|/]#\\&#g')
$ echo 'f*{[(\ta^b/d).*+?^$|]} - 3' | sed -E 's/'"$s"'/'"$r"'/g'
f*a/b&c\d - 3

# BRE version for search string
$ s='{[(\ta^b/d).*+?^$|]}'
$ s=$(printf '%s' "$s" | sed 's#[[^$*.\/]#\\&#g')
$ echo 'f*{[(\ta^b/d).*+?^$|]} - 3' | sed 's/'"$s"'/'"$r"'/g'
f*a/b&c\d - 3
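
The escaping commands above can be wrapped in small helpers so that they are applied consistently. This is only a sketch: the function names esc_bre and esc_repl are made up here, both assume / as the delimiter, and neither handles newlines in the value.

```shell
# escape a string for literal use in the search section (BRE, / as delimiter)
esc_bre() { printf '%s' "$1" | sed 's#[[^$*.\/]#\\&#g'; }

# escape a string for literal use in the replacement section (/ as delimiter)
esc_repl() { printf '%s' "$1" | sed 's#[\&/]#\\&#g'; }

s=$(esc_bre 'a.b*c')
r=$(esc_repl 'x/y&z')

# only the literal text a.b*c is replaced
# unescaped, the regexp a.b*c would have matched aXbbc instead
res=$(echo 'a.b*c and aXbbc' | sed 's/'"$s"'/'"$r"'/g')
printf '%s\n' "$res"
# x/y&z and aXbbc
```

Using printf '%s' instead of echo avoids surprises when the value starts with - or contains backslashes.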

For a more detailed analysis on escaping the metacharacters, check these Q&A threads:

info See also wooledge: Why is $() preferred over backticks?

Command substitution

This section will show examples of using the output of a shell command as part of the sed script. All the precautions seen earlier apply here too.

# note that the trailing newline character of the command output gets stripped
$ echo 'today is date.' | sed 's/date/'"$(date -I)"'/'
today is 2023-05-30.

# in some cases, changing the delimiter alone is enough
$ printf 'f1.txt\nf2.txt\n' | sed 's|^|'"$(pwd)"'/|'
/home/learnbyexample/f1.txt
/home/learnbyexample/f2.txt

# for a robust solution, always preprocess the command output
$ p=$(pwd | sed 's|/|\\/|g')
$ printf 'f1.txt\nf2.txt\n' | sed 's/^/'"${p}"'\//'
/home/learnbyexample/f1.txt
/home/learnbyexample/f2.txt
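
The stripping mentioned in the first comment is done by the shell, not by sed. It removes every trailing newline and leaves the interior ones alone. A small sketch, using printf as a stand-in for a real command so that the output is reproducible:

```shell
# three trailing newlines are all removed by the command substitution
out=$(printf 'x\n\n\n')
printf '%s\n' "${#out}"
# 1

# an interior newline survives, only the final one is removed
out2=$(printf 'a\nb\n')
printf '%s\n' "${#out2}"
# 3

# so single-line output can be dropped straight into the replacement section
today=$(printf '2023-05-30\n')    # stand-in for a command like date -I
res=$(echo 'today is date.' | sed 's/date/'"$today"'/')
printf '%s\n' "$res"
# today is 2023-05-30.
```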

warning Multiline command output cannot be substituted in this manner, as the substitute command doesn't allow literal newlines in the replacement section unless escaped.

$ printf 'a\n[x]\nb\n' | sed 's/x/'"$(seq 3)"'/'
sed: -e expression #1, char 5: unterminated `s' command
# prefix literal newlines with \ except the last newline
$ printf 'a\n[x]\nb\n' | sed 's/x/'"$(seq 3 | sed '$!s/$/\\/' )"'/'
a
[1
2
3]
b
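
Since multiline output may also contain \, & or the delimiter, the two escaping steps can be combined. A sketch, assuming / as the delimiter (the name esc_repl_ml is made up for this example):

```shell
# escape \, & and / first, then add a trailing \ to every line except the last
esc_repl_ml() {
    printf '%s' "$1" | sed -e 's#[\&/]#\\&#g' -e '$!s/$/\\/'
}

text=$(printf 'a/b\nc&d')    # sample multiline value
r=$(esc_repl_ml "$text")
res=$(printf 'p\n[x]\nq\n' | sed 's/x/'"$r"'/')
printf '%s\n' "$res"
# p
# [a/b
# c&d]
# q
```

The order of the two expressions matters: escaping the metacharacters first ensures that the \ added before each newline isn't itself escaped.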

Cheatsheet and summary

Note                            Description
sed -n "${start},+${step}p"     dynamically constructed sed command
                                start and step are shell variables
                                their values get substituted before sed is executed
sed "/${word}/!d"               entire command in double quotes is risky
                                within double quotes, $, \, ! and ` are special
sed '/'"${word}"'/!d'           use double quotes only where needed
                                and variable contents have to be preprocessed to prevent
                                clashing with sed metacharacters and security issues
                                if you don't control the variable contents
sed 's#[\&/]#\\&#g'             escape metacharacters for replacement section
sed '$!s/$/\\/'                 escape literal newlines for replacement section
sed 's#[{[()^$*?+.\|/]#\\&#g'   escape metacharacters for search section, ERE
sed 's#[[^$*.\/]#\\&#g'         escape metacharacters for search section, BRE
sed 's/date/'"$(date -I)"'/'    example for command substitution
                                command output's final newline character gets stripped
                                other literal newlines, if any, have to be escaped

This chapter covered some of the ways to construct a sed command dynamically. Like most things in software programming, 90% of the cases are relatively easy to accomplish, but the other 10% could get significantly complicated. Dealing with the clash between the shell and sed metacharacters is a mess and I'd even suggest looking for alternatives such as perl to reduce the complexity. The next chapter will cover some more command line options.

Exercises

1) Replace #expr# with the value of the usr_ip shell variable. Assume that this variable can only contain the metacharacters as shown in the sample below.

$ usr_ip='c = (a/b) && (x-5)'
$ mod_ip=$(echo "$usr_ip" | sed ##### add your solution here)
$ echo 'Expression: #expr#' | sed ##### add your solution here
Expression: c = (a/b) && (x-5)

2) Repeat the previous exercise, but this time with command substitution instead of using a temporary variable.

$ usr_ip='c = (a/b/y) && (x-5)'
$ echo 'Expression: #expr#' | sed ##### add your solution here
Expression: c = (a/b/y) && (x-5)