Shell Features
This chapter focuses on Bash shell features like quoting mechanisms, wildcards, redirections, command grouping, process substitution, command substitution, etc. Others will be discussed in later chapters.
The example_files directory has the scripts and sample input files used in this chapter.
Some of the examples in this chapter use commands that will be discussed in later chapters. Basic descriptions of what such commands do have been added here, and you'll see more examples in the rest of the chapters.
Quoting mechanisms
This section will quote (heh) the relevant definitions from the bash manual and provide some examples for each of the four mechanisms.
1) Escape Character
A non-quoted backslash \ is the Bash escape character. It preserves the literal value of the next character that follows, with the exception of newline.
metacharacter: A character that, when unquoted, separates words. A metacharacter is a space, tab, newline, or one of the following characters: |, &, ;, (, ), <, or >.
Here's an example where an unquoted shell metacharacter causes an error:
$ echo apple;cherry
apple
cherry: command not found
# '\;' escapes the ';' character, thus losing the metacharacter meaning
$ echo apple\;cherry
apple;cherry
And here's an example where a subtler issue might not be apparent at first glance:
# this will create two files named 'new' and 'file.txt'
# aim was to create a single file named 'new file.txt'
$ touch new file.txt
$ ls new*txt
ls: cannot access 'new*txt': No such file or directory
$ rm file.txt new
# escaping the space will create a single file named 'new file.txt'
$ touch new\ file.txt
$ ls new*txt
'new file.txt'
$ rm new\ file.txt
2) Single Quotes
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
No character is special within single quoted strings. Here's an example:
$ echo 'apple;cherry'
apple;cherry
You can place strings represented by different quoting mechanisms next to each other to concatenate them together. Here's an example:
# concatenation of four strings
# 1: '@fruits = '
# 2: \'
# 3: 'apple and banana'
# 4: \'
$ echo '@fruits = '\''apple and banana'\'
@fruits = 'apple and banana'
3) Double Quotes
Enclosing characters in double quotes (") preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !.
Here's an example showing variable interpolation within double quotes:
$ qty='5'
# as seen earlier, no character is special within single quotes
$ echo 'I bought $qty apples'
I bought $qty apples
# a typical use of double quotes is to enable variable interpolation
$ echo "I bought $qty apples"
I bought 5 apples
Unless you specifically want the shell to interpret the contents of a variable, you should always quote the variable to avoid issues due to the presence of shell metacharacters.
$ f='new file.txt'
# same as: echo 'apple banana' > new file.txt
$ echo 'apple banana' > $f
bash: $f: ambiguous redirect
# same as: echo 'apple banana' > 'new file.txt'
$ echo 'apple banana' > "$f"
$ cat "$f"
apple banana
$ rm "$f"
See also unix.stackexchange: Why does my shell script choke on whitespace or other special characters?.
4) ANSI-C Quoting
Words of the form
$'string'
are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard.
This form of quoting helps you use escape sequences like \t
for tab, \n
for newline and so on. You can also represent characters using their codepoint values in octal and hexadecimal formats.
# can also use echo -e 'fig:\t42' or printf 'fig:\t42\n'
$ echo $'fig:\t42'
fig: 42
# \x27 represents the single quote character in hexadecimal format
$ echo $'@fruits = \x27apple and banana\x27'
@fruits = 'apple and banana'
# 'grep' helps you to filter lines based on the given pattern
# but it doesn't recognize escapes like '\t' for tab characters
$ printf 'fig\t42\napple 100\nball\t20\n' | grep '\t'
# in such cases, one workaround is to use ANSI-C quoting
$ printf 'fig\t42\napple 100\nball\t20\n' | grep $'\t'
fig 42
ball 20
printf
is a shell builtin which you can use to format arguments (similar to the printf()
function from the C
programming language). This command will be used in many more examples to come.
See bash manual: ANSI-C Quoting for the complete list of supported escape sequences. See
man ascii
for a table of ASCII characters and their numerical representations.
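The hexadecimal example above has an octal counterpart. Here's a minimal sketch, assuming Bash (the variable names are only for illustration):

```shell
# \047 is the octal form of \x27 (single quote)
hex=$'@fruits = \x27apple\x27'
oct=$'@fruits = \047apple\047'
printf '%s\n' "$hex" "$oct"

# \011 is the octal form of the tab character
tab_escape=$'\t'
tab_octal=$'\011'
[[ $tab_escape == "$tab_octal" ]] && echo 'same tab character'
```

Both printf lines show the same text, since the two spellings refer to the same character.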
Wildcards
It is relatively easy to specify complete filenames as command arguments when they are few in number. And you could use features like tab completion and middle mouse button click (which pastes the last highlighted text) to assist in such cases.
But what if you have to deal with tens or hundreds of files (or even more)? If applicable, one way is to match all the files based on a common pattern in their filenames, for example extensions like .py, .txt and so on. Wildcards (globs) will help in such cases. This feature is provided by the shell, and thus individual commands need not worry about implementing it. The pattern matching supported by wildcards is somewhat similar to regular expressions, but there are fundamental and syntactical differences between them.
Some of the commonly used wildcards are listed below:
- * match any character, zero or more times
    - as a special case, * won't match the starting . of hidden files unless the dotglob shell option is set
- ? match any character exactly once
- [set149] match any of these characters once
- [^set149] match any characters except the given set of characters
    - you can also use [!set149] to negate the character class
- [a-z] match a range of characters from a to z
- [0-9a-fA-F] match any hexadecimal character
And here are some examples:
# change to the 'scripts' directory and source the 'globs.sh' script
$ source globs.sh
$ ls
100.sh f1.txt f4.txt hi.sh math.h report-02.log
42.txt f2_old.txt f7.txt ip.txt notes.txt report-04.log
calc.py f2.txt hello.py main.c report-00.log report-98.log
# beginning with 'c' or 'h' or 't'
$ ls [cht]*
calc.py hello.py hi.sh
# only hidden files and directories
$ ls -d .*
. .. .hidden .somerc
# ending with '.c' or '.py'
$ ls *.c *.py
calc.py hello.py main.c
# containing 'o' as well as 'x' or 'y' or 'z' afterwards
$ ls *o*[xyz]*
f2_old.txt hello.py notes.txt
# ending with '.' and two more characters
$ ls *.??
100.sh calc.py hello.py hi.sh
# shouldn't start with 'f' and ends with '.txt'
$ ls [^f]*.txt
42.txt ip.txt notes.txt
# containing digits '1' to '5' and ending with 'log'
$ ls *[1-5]*log
report-02.log report-04.log
Since some characters are special inside the character class, you need special placement to treat them as ordinary characters:
- - should be the first or the last character in the set
- ^ should be other than the first character
- ] should be the first character
$ ls *[ns-]*
100.sh main.c report-00.log report-04.log
hi.sh notes.txt report-02.log report-98.log
$ touch 'a^b' 'mars[planet].txt'
$ rm -i *[]^]*
rm: remove regular empty file 'a^b'? y
rm: remove regular empty file 'mars[planet].txt'? y
A named character set is defined by a name enclosed between [:
and :]
and has to be used within a character class []
, along with any other characters as needed.
Named set | Description |
---|---|
[:digit:] | [0-9] |
[:lower:] | [a-z] |
[:upper:] | [A-Z] |
[:alpha:] | [a-zA-Z] |
[:alnum:] | [0-9a-zA-Z] |
[:word:] | [0-9a-zA-Z_] |
[:xdigit:] | [0-9a-fA-F] |
[:cntrl:] | control characters — first 32 ASCII characters and 127th (DEL) |
[:punct:] | all the punctuation characters |
[:graph:] | [:alnum:] and [:punct:] |
[:print:] | [:alnum:] , [:punct:] and space |
[:ascii:] | all the ASCII characters |
[:blank:] | space and tab characters |
[:space:] | whitespace characters |
# starting with a digit character, same as: [0-9]*
$ ls [[:digit:]]*
100.sh 42.txt
# starting with a digit character or 'c'
# same as: [0-9c]*
$ ls [[:digit:]c]*
100.sh 42.txt calc.py
# starting with a non-alphabet character
$ ls [^[:alpha:]]*
100.sh 42.txt
As mentioned before, you can use echo to test how the wildcards will expand before using a command to act upon the matching files. For example, echo *.txt before using commands like rm *.txt. One difference compared to ls is that echo will display the wildcard as is instead of showing an error if there's no match.
See bash manual: Pattern Matching for more details, including how the locale affects matching.
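Two of the behaviors described in this section can be checked in a throwaway directory: echo showing an unmatched wildcard as is, and the effect of the dotglob option. This is a minimal sketch that assumes default shell options (nullglob and failglob not set). The filenames and variable names are made up for the illustration.

```shell
# work in a temporary directory so that no real files are touched
orig_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch notes.txt .hidden

# no filename ends with '.log', so echo gets the pattern itself
no_match=$(echo *.log)
echo "$no_match"

# by default, '*' skips names starting with '.'
default_list=$(echo *)
echo "$default_list"

# with 'dotglob' set, hidden files are matched as well
shopt -s dotglob
dotglob_list=$(echo *)
echo "$dotglob_list"
shopt -u dotglob

cd "$orig_dir" && rm -r "$tmpdir"
```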
Brace Expansion
This is not a wildcard feature; you just get expanded strings. Brace expansion has two mechanisms for reducing typing:
- taking out common portions among multiple strings
- generating a range of characters
Say you want to create two files named test_x.txt
and test_y.txt
. These two strings have something in common at the start and the end. You can specify the unique portions as comma separated strings within a pair of curly braces and put the common parts around the braces. Multiple braces can be used as needed. Use echo
for testing purposes.
$ mkdir practice_brace
$ cd practice_brace
# same as: touch ip1.txt ip3.txt ip7.txt
$ touch ip{1,3,7}.txt
$ ls ip*txt
ip1.txt ip3.txt ip7.txt
# same as: mv ip1.txt ip_a.txt
$ mv ip{1,_a}.txt
$ ls ip*txt
ip3.txt ip7.txt ip_a.txt
$ echo adders/{half,full}_adder.v
adders/half_adder.v adders/full_adder.v
$ echo file{0,1}.{txt,log}
file0.txt file0.log file1.txt file1.log
# empty alternate is allowed too
$ echo file{,1}.txt
file.txt file1.txt
# example with nested braces
$ echo file.{txt,log{,.bkp}}
file.txt file.log file.log.bkp
To generate a range, specify numbers or single characters separated by ..
and an optional third argument as the step value. Here are some examples:
$ echo {1..4}
1 2 3 4
$ echo {4..1}
4 3 2 1
$ echo {1..2}{a..b}
1a 1b 2a 2b
$ echo file{1..4}.txt
file1.txt file2.txt file3.txt file4.txt
$ echo file{1..10..2}.txt
file1.txt file3.txt file5.txt file7.txt file9.txt
$ echo file_{x..z}.txt
file_x.txt file_y.txt file_z.txt
$ echo {z..j..-3}
z w t q n k
# '0' prefix
$ echo {008..10}
008 009 010
If the use of braces doesn't match the expansion syntax, it will be left as is:
$ echo file{1}.txt
file{1}.txt
$ echo file{1-4}.txt
file{1-4}.txt
Extended and Recursive globs
From man bash:
Extended glob | Description |
---|---|
?(pattern-list) | Matches zero or one occurrence of the given patterns |
*(pattern-list) | Matches zero or more occurrences of the given patterns |
+(pattern-list) | Matches one or more occurrences of the given patterns |
@(pattern-list) | Matches one of the given patterns |
!(pattern-list) | Matches anything except one of the given patterns |
Extended globs are disabled by default. You can use the shopt
builtin to set/unset shell options like extglob
, globstar
, etc. You can also check the current status of such options.
$ shopt extglob
extglob off
# set extglob
$ shopt -s extglob
$ shopt extglob
extglob on
# unset extglob
$ shopt -u extglob
$ shopt extglob
extglob off
Here are some examples, assuming the extglob option has already been set:
# change to the 'scripts' directory and source the 'globs.sh' script
$ source globs.sh
$ ls
100.sh f1.txt f4.txt hi.sh math.h report-02.log
42.txt f2_old.txt f7.txt ip.txt notes.txt report-04.log
calc.py f2.txt hello.py main.c report-00.log report-98.log
# one or more digits followed by '.' and then zero or more characters
$ ls +([0-9]).*
100.sh 42.txt
# same as: ls *.c *.sh
$ ls *.@(c|sh)
100.sh hi.sh main.c
# not ending with '.txt'
$ ls !(*.txt)
100.sh hello.py main.c report-00.log report-04.log
calc.py hi.sh math.h report-02.log report-98.log
# not ending with '.txt' or '.log'
$ ls *.!(txt|log)
100.sh calc.py hello.py hi.sh main.c math.h
If you enable the globstar
option, you can recursively match filenames within a specified path.
# change to the 'scripts' directory and source the 'ls.sh' script
$ source ls.sh
# with 'find' command (this will be explained in a later chapter)
$ find -name '*.txt'
./todos/books.txt
./todos/outing.txt
./ip.txt
# with 'globstar' enabled
$ shopt -s globstar
$ ls **/*.txt
ip.txt todos/books.txt todos/outing.txt
# another example
$ ls -1 **/*.@(py|html)
backups/bookmarks.html
hello_world.py
projects/tictactoe/game.py
Add the shopt invocations to ~/.bashrc if you want these settings applied at terminal startup. This will be discussed in the Shell Customization chapter.
set
The set
builtin command helps you to set or unset values of shell options and positional parameters. Here are some examples for shell options:
# disables logging command history from this point onwards
$ set +o history
# enable history logging
$ set -o history
# use vi-style CLI editing interface
$ set -o vi
# use emacs-style interface, this is usually the default
$ set -o emacs
You'll see more examples (for example, set -x
) in later chapters. See bash manual: Set Builtin for documentation.
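The examples above cover only shell options. For the positional parameters part, here's a minimal sketch (the values and variable names are made up for the illustration):

```shell
# 'set --' replaces the positional parameters of the current shell
set -- apple 'fig and mango' cherry

# $# is the number of positional parameters, $2 is the second one
count=$#
second=$2
echo "$count"
echo "$second"

# 'set --' with no further arguments unsets all the positional parameters
set --
count_after=$#
echo "$count_after"
```

The -- marks the end of options, which is helpful if the first value starts with a - character.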
Pipelines
The pipe control operator |
helps you connect the output of a command as the input of another command. This operator vastly reduces the need for temporary intermediate files. As discussed previously in the Unix Philosophy section, command line tools usually specialize in a single task. If you can break down a problem into smaller tasks, the pipe operator will come in handy often. Here are some examples:
# change to the 'scripts' directory and source the 'du.sh' script
$ source du.sh
# list of files
$ ls
projects report.log todos
# count the number of files
# you can also use: printf '%q\n' * | wc -l
$ ls -q | wc -l
3
# report the size of files/folders in human readable format
# and then sort them based on human readable sizes in ascending order
$ du -sh * | sort -h
8.0K todos
48K projects
7.4M report.log
In the above examples, ls
and du
perform their own tasks of displaying the list of files and showing file sizes respectively. After that, the wc
and sort
commands take care of counting and sorting the lines respectively. In such cases, the pipe operator saves you the trouble of dealing with temporary data.
Note that the %q
format specifier in printf
helps you quote the arguments in a way that is recognizable by the shell. The -q
option for ls
substitutes nongraphic characters in the filenames with a ?
character. Both of these are workarounds to prevent the counting process from getting sidetracked due to characters like newline in the filenames.
The pipe control operator
|&
will be discussed later in this chapter.
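To see why the two counting workarounds matter, here's a rough sketch with a filename containing a newline character. It assumes that ls writes filenames as is when its output goes to a pipe, which is how GNU ls behaves. The directory and filenames are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
# the second filename contains a newline character
touch "$tmpdir/report.log" "$tmpdir"/$'new\nfile.txt'

# plain 'ls' output is split across an extra line, inflating the count
naive=$(ls "$tmpdir" | wc -l)
# 'ls -q' replaces the newline with '?', so there is one line per file
with_q=$(ls -q "$tmpdir" | wc -l)
# '%q' quotes each name in a shell-reusable form, again one line per file
with_printf=$(cd "$tmpdir" && printf '%q\n' * | wc -l)

echo "$naive $with_q $with_printf"
rm -r "$tmpdir"
```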
tee
Sometimes, you might want to display the command output on the terminal as well as save the results for later use. In such cases, you can use the tee
command:
$ du -sh * | tee sizes.log
48K projects
7.4M report.log
8.0K todos
$ cat sizes.log
48K projects
7.4M report.log
8.0K todos
$ rm sizes.log
Redirection
From bash manual: Redirections:
Before a command is executed, its input and output may be redirected using a special notation interpreted by the shell. Redirection allows commands' file handles to be duplicated, opened, closed, made to refer to different files, and can change the files the command reads from and writes to. Redirection may also be used to modify file handles in the current shell execution environment.
There are three standard data streams:
- standard input (stdin, file descriptor 0)
- standard output (stdout, file descriptor 1)
- standard error (stderr, file descriptor 2)
Both the standard output and error streams are displayed on the terminal by default. The stderr
stream is used when something goes wrong with the command usage. Each of these three streams has a predefined file descriptor as mentioned above. In this section, you'll see how to redirect these three streams.
Redirections can be placed anywhere, but they are usually used at the start or end of a command. For example, the following two commands are equivalent:
>op.txt grep 'error' report.log
grep 'error' report.log >op.txt
Space characters between the redirection operators and the filename are optional.
Redirecting output
You can use the >
operator to redirect the standard output of a command to a file. A number prefix can be added to the >
operator to work with that particular file descriptor. Default is 1
(recall that the file descriptor for stdout
is 1
), so 1>
and >
perform the same operation. Use >>
to append the output to a file.
The filename provided to the >
and >>
operators will be created if a regular file of that name doesn't exist yet. If the file already exists, >
will overwrite that file whereas >>
will append the contents.
# change to the 'example_files/text_files' directory for this section
# save first three lines of 'sample.txt' to 'op.txt'
$ head -n3 sample.txt > op.txt
$ cat op.txt
1) Hello World
2)
3) Hi there
# append last two lines of 'sample.txt' to 'op.txt'
$ tail -n2 sample.txt >> op.txt
$ cat op.txt
1) Hello World
2)
3) Hi there
14) He he he
15) Adios amigo
$ rm op.txt
You can use /dev/null as a filename to discard the output, to provide an empty file as input for a command, etc.
You can use set -o noclobber to prevent overwriting if a file already exists. When the noclobber option is set, you can still overwrite a file by using >| instead of the > operator.
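The two notes above can be tried out with a temporary file. This is a minimal sketch, with the file and variable names made up for the illustration:

```shell
tmpdir=$(mktemp -d)
out="$tmpdir/op.txt"
echo 'first' > "$out"

set -o noclobber
# with noclobber set, '>' refuses to overwrite the existing file
# the error message of the failed redirection is discarded via /dev/null
blocked=no
{ echo 'second' > "$out"; } 2> /dev/null || blocked=yes
# '>|' overrides noclobber for this one redirection
echo 'third' >| "$out"
set +o noclobber

content=$(cat "$out")
echo "$blocked $content"

# /dev/null as input behaves like an empty file
empty_count=$(wc -l < /dev/null)
echo "$empty_count"

rm -r "$tmpdir"
```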
Redirecting input
Some commands like tr
and datamash
can only work with data from the standard input. This isn't an issue when you are piping data from another command, for example:
# filter lines containing 'the' from the input file 'greeting.txt'
# and then display the results in uppercase using the 'tr' command
$ grep 'the' greeting.txt | tr 'a-z' 'A-Z'
HI THERE
You can use the <
redirection operator if you want to pass data from a file to such commands. The default prefix here is 0
, which is the file descriptor for stdin
data. Here's an example:
$ tr 'a-z' 'A-Z' <greeting.txt
HI THERE
HAVE A NICE DAY
In some cases, a tool behaves differently when processing stdin
data compared to file input. Here's an example with wc -l
to report the total number of lines in the input:
# line count, filename is part of the output as well
$ wc -l purchases.txt
8 purchases.txt
# filename won't be part of the output for stdin data
# helpful for assigning the number to a variable for scripting purposes
$ wc -l <purchases.txt
8
Sometimes, you need to pass stdin
data as well as other file inputs to a command. In such cases, you can use -
to represent data from the standard input. Here's an example:
$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
# insert a column at the start
$ printf 'ID\n1\n2\n3' | paste -d, - scores.csv
ID,Name,Maths,Physics,Chemistry
1,Ith,100,100,100
2,Cy,97,98,95
3,Lin,78,83,80
Even though a command accepts file input directly as an argument, redirecting can help for interactive usage. Here's an example:
# display only the third field
$ <scores.csv cut -d, -f3
Physics
100
98
83
# later, you realize that you need the first field too
# use 'up' arrow key to bring the previous command
# and modify the argument easily at the end
# if you had used cut -d, -f3 scores.csv instead,
# you'd have to navigate past the filename to modify the argument
$ <scores.csv cut -d, -f1,3
Name,Physics
Ith,100
Cy,98
Lin,83
Don't use cat filename | cmd for passing file content as stdin data, unless you need to concatenate data from multiple input files. See wikipedia: UUOC and Useless Use of Cat Award for more details.
Redirecting error
Recall that the file descriptor for stderr
is 2
. So, you can use 2>
to redirect standard error to a file. Use 2>>
if you need to append the contents. Here's an example:
# assume 'abcdxyz' doesn't exist as a shell command
$ abcdxyz
abcdxyz: command not found
# the error in such cases will be part of the stderr stream, not stdout
# so, you'll need to use 2> here
$ abcdxyz 2> cmderror.log
$ cat cmderror.log
abcdxyz: command not found
$ rm cmderror.log
Use
/dev/null
as a filename if you need to discard the results.
Combining stdout and stderr
Newer versions of Bash provide these handy shortcuts:
- &> redirect both stdout and stderr (overwrite if file already exists)
- &>> redirect both stdout and stderr (append if file already exists)
- |& pipe both stdout and stderr as input to another command
Here's an example which assumes xyz.txt
doesn't exist, thus leading to errors:
# using '>' will redirect only the stdout stream
# stderr will be displayed on the terminal
$ grep 'log' file_size.txt xyz.txt > op.txt
grep: xyz.txt: No such file or directory
# using '&>' will redirect both the stdout and stderr streams
$ grep 'log' file_size.txt xyz.txt &> op.txt
$ cat op.txt
file_size.txt:104K power.log
file_size.txt:746K report.log
grep: xyz.txt: No such file or directory
$ rm op.txt
And here's an example with the |&
operator:
# filter lines containing 'log' from the given file arguments
# and then filter lines containing 'or' from the combined stdout and stderr
$ grep 'log' file_size.txt xyz.txt |& grep 'or'
file_size.txt:746K report.log
grep: xyz.txt: No such file or directory
For earlier Bash versions, you'll have to manually redirect the streams:
- 1>&2 redirects file descriptor 1 (stdout) to the file descriptor 2 (stderr)
- 2>&1 redirects file descriptor 2 (stderr) to the file descriptor 1 (stdout)
Here are some examples:
# note that the order of redirections is important here
# you can also use: 2> op.txt 1>&2
$ grep 'log' file_size.txt xyz.txt > op.txt 2>&1
$ cat op.txt
file_size.txt:104K power.log
file_size.txt:746K report.log
grep: xyz.txt: No such file or directory
$ rm op.txt
$ grep 'log' file_size.txt xyz.txt 2>&1 | grep 'or'
file_size.txt:746K report.log
grep: xyz.txt: No such file or directory
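The comment about the order of redirections can be illustrated with a rough sketch. The both function is a hypothetical helper that writes one line to each stream, and the filenames are made up for this example.

```shell
tmpdir=$(mktemp -d)

# hypothetical helper: one line to stdout and one line to stderr
both() { echo 'to stdout'; echo 'to stderr' >&2; }

# stdout goes to the file first, then stderr is pointed at the same file
both > "$tmpdir/right.txt" 2>&1

# stderr is pointed at the current stdout (captured by the command
# substitution here) and only then is stdout sent to the file
leaked=$(both 2>&1 > "$tmpdir/wrong.txt")

right=$(cat "$tmpdir/right.txt")
wrong=$(cat "$tmpdir/wrong.txt")
printf 'right.txt:\n%s\n' "$right"
printf 'wrong.txt:\n%s\n' "$wrong"
printf 'not in wrong.txt:\n%s\n' "$leaked"
rm -r "$tmpdir"
```

Redirections are processed from left to right, so 2>&1 copies whatever stdout refers to at that point.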
Waiting for stdin
Sometimes, you might mistype a command without providing input. And instead of getting an error, you'll see the cursor patiently waiting for something. This isn't the shell hanging up on you. The command is waiting for you to type data, so that it can perform its task.
Say, you typed cat
and pressed the Enter key. Seeing the blinking cursor, you type some text and press the Enter key again. You'll see the text you just typed echoed back to you as stdout
(which is the functionality of the cat
command). This will continue again and again, until you tell the shell that you are done. How to do that? Press Ctrl+d
on a fresh line or press Ctrl+d
twice at the end of a line. In the latter case, you'll not get a newline character at the end of the data.
# press Enter and Ctrl+d after typing all the required characters
$ cat
knock knock
knock knock
anybody here?
anybody here?
# 'tr' command here translates lowercase to uppercase
$ tr 'a-z' 'A-Z'
knock knock
KNOCK KNOCK
anybody here?
ANYBODY HERE?
Getting output immediately after each input line depends on the command's functionality. Commands like sort and shuf will wait for the entire input data before producing the output.
# press Ctrl+d after the third input line
$ sort
lion
zebra
bee
bee
lion
zebra
Here's an example which has output redirection as well:
# press Ctrl+d after the line containing 'histogram'
# filter lines containing 'is'
$ grep 'is' > op.txt
hi there
this is a sample line
have a nice day
histogram
$ cat op.txt
this is a sample line
histogram
$ rm op.txt
See also unix.stackexchange: difference between Ctrl+c and Ctrl+d.
Here Documents
A Here Document is another way to provide stdin
data. In this case, the termination condition is a line matching a predefined string which is specified after the <<
redirection operator. This is especially helpful for automation, since pressing Ctrl+d
interactively isn't desirable. Here's an example:
# EOF is typically used as the special string
$ cat << 'EOF' > fruits.txt
> banana 2
> papaya 3
> mango 10
> EOF
$ cat fruits.txt
banana 2
papaya 3
mango 10
$ rm fruits.txt
In the above example, the termination string was enclosed in single quotes as a good practice. Doing so prevents parameter expansion, command substitution, etc. You can also use \string
for this purpose. If you use <<-
instead of <<
, leading tab characters can be added at the start of input lines without being part of the actual data.
Just like $ and a space represents the primary prompt (PS1 shell variable), > and a space at the start of lines represents the secondary prompt PS2 (applicable for multiline commands). Don't type these characters when you use Here Documents in a shell script.
See bash manual: Here Documents and stackoverflow: here documents for more examples and details.
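The effect of quoting the termination string can be seen by running the same input both ways. Here's a minimal sketch written as a script (so there are no prompt characters), with made-up variable names:

```shell
qty=42

# unquoted delimiter: parameter and arithmetic expansions take place
expanded=$(cat << EOF
apple $qty
sum: $((qty + 1))
EOF
)

# quoted delimiter: the input lines are passed through as is
literal=$(cat << 'EOF'
apple $qty
sum: $((qty + 1))
EOF
)

printf '%s\n' "$expanded" "$literal"
```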
Here Strings
This is similar to Here Documents, but the string is passed as an argument after the <<<
redirection operator. Here are some examples:
$ tr 'a-z' 'A-Z' <<< hello
HELLO
$ tr 'a-z' 'A-Z' <<< 'hello world'
HELLO WORLD
$ greeting='hello world'
$ tr 'a-z' 'A-Z' > op.txt <<< "$greeting"
$ cat op.txt
HELLO WORLD
$ rm op.txt
Further Reading
- Short introduction to shell redirection
- Illustrated Redirection Tutorial
- stackoverflow: Redirect a stream to another file descriptor using >&
- Difference between 2>&1 >foo and >foo 2>&1
- stackoverflow: Redirect and append both stdout and stderr to a file
- unix.stackexchange: Examples for <> redirection
Grouping commands
You can use the (list)
and { list; }
compound commands to redirect content for several commands. The former is executed in a subshell whereas the latter is executed in the current shell context. Spaces around ()
are optional but necessary for the {}
version. From bash manual: Lists of Commands:
A list is a sequence of one or more pipelines separated by one of the operators ;, &, &&, or ||, and optionally terminated by one of ;, &, or a newline.
Here are some examples of command groupings:
# change to the 'example_files/text_files' directory for this section
# the 'sed' command here gives the first line of the input
# rest of the lines are then processed by the 'sort' command
# thus, the header will always be the first line in the output
$ (sed -u '1q' ; sort) < scores.csv
Name,Maths,Physics,Chemistry
Cy,97,98,95
Ith,100,100,100
Lin,78,83,80
# save first three and last two lines from 'sample.txt' to 'op.txt'
$ { head -n3 sample.txt; tail -n2 sample.txt; } > op.txt
$ cat op.txt
1) Hello World
2)
3) Hi there
14) He he he
15) Adios amigo
$ rm op.txt
You might wonder why the second command did not use < sample.txt
instead of repeating the filename twice. The reason is that some commands might read more than what is required (for buffering purposes) and thus cause issues for the remaining commands. In the sed+sort
example, the -u
option guarantees that sed
will not read more than the required data. See unix.stackexchange: sort but keep header line at the top for more examples and details.
You don't need the () or {} groups to see the results of multiple commands on the terminal. Just the ; separator between the commands would be enough. See also bash manual: Command Execution Environment.
$ head -n1 sample.txt ; echo 'have a nice day'
1) Hello World
have a nice day
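The subshell versus current shell distinction mentioned at the start of this section shows up when a group changes a variable. This is a minimal sketch, with variable names made up for the illustration:

```shell
count=1

# ( ) runs in a subshell, so the assignment is lost afterwards
( count=2 )
after_subshell=$count

# { } runs in the current shell, so the assignment persists
{ count=3; }
after_braces=$count

echo "$after_subshell $after_braces"
```

The same applies to other changes to the environment, such as cd within the group.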
List control operators
You can use these operators to control the execution of the subsequent command depending on the exit status of the first command. From bash manual: Lists of Commands:
AND and OR lists are sequences of one or more pipelines separated by the control operators && and ||, respectively. AND and OR lists are executed with left associativity.
For an AND list, the second command will be executed if and only if the first command exits with 0
status.
# first command succeeds here, so the second command is also executed
$ echo 'hello' && echo 'have a nice day'
hello
have a nice day
# assume 'abcdxyz' doesn't exist as a shell command
# the second command will not be executed
$ abcdxyz && echo 'have a nice day'
abcdxyz: command not found
# if you use ';' instead, the second command will still be executed
$ abcdxyz ; echo 'have a nice day'
abcdxyz: command not found
have a nice day
For an OR list, the second command will be executed if and only if the first command does not exit with 0
status.
# since the first command succeeds, the second one won't run
$ echo 'hello' || echo 'have a nice day'
hello
# assume 'abcdxyz' doesn't exist as a shell command
# since the first command fails, the second one will run
$ abcdxyz || echo 'have a nice day'
abcdxyz: command not found
have a nice day
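The left associativity mentioned in the quote matters when && and || are mixed in the same list. Here's a rough sketch using the true and false commands, which do nothing other than exit with a success and a failure status respectively:

```shell
# grouped as: (false && echo 'A') || echo 'B'
# the AND list fails, so the command after || runs
r1=$(false && echo 'A' || echo 'B')

# grouped as: (true || echo 'A') && echo 'B'
# the OR list succeeds without running echo 'A', then echo 'B' runs
r2=$(true || echo 'A' && echo 'B')

echo "$r1 $r2"
```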
Command substitution
Command substitution allows you to use the standard output of a command as part of another command. Trailing newlines, if any, will be removed. You can use the newer and preferred syntax $(command)
or the older syntax `command`
. Here are some examples:
# sample input
$ printf 'hello\ntoday is: \n'
hello
today is:
# append output from the 'date' command to the line containing 'today'
$ printf 'hello\ntoday is: \n' | sed '/today/ s/$/'"$(date +%A)"'/'
hello
today is: Monday
# save the output of 'wc' command to a variable
# same as: line_count=`wc -l <sample.txt`
$ line_count=$(wc -l <sample.txt)
$ echo "$line_count"
15
Here's an example with nested substitutions:
# dirname removes the trailing path component
$ dirname projects/tictactoe/game.py
projects/tictactoe
# basename removes the leading directory component
$ basename projects/tictactoe
tictactoe
$ proj=$(basename "$(dirname projects/tictactoe/game.py)")
$ echo "$proj"
tictactoe
The difference between the two types of syntax is quoted below from bash manual: Command Substitution:
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \. The first backquote not preceded by a backslash terminates the command substitution. When using the $(command) form, all characters between the parentheses make up the command; none are treated specially.

Command substitutions may be nested. To nest when using the backquoted form, escape the inner backquotes with backslashes.
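Here's a minimal sketch of the nesting rule quoted above, reusing the dirname and basename example from this section. The last part illustrates the removal of trailing newlines. The variable names are made up for the illustration.

```shell
# $(...) nests without any extra escaping
new_style=$(basename "$(dirname projects/tictactoe/game.py)")

# the backquoted form needs the inner backquotes escaped
old_style=`basename \`dirname projects/tictactoe/game.py\``

echo "$new_style $old_style"

# all the trailing newlines are removed, the inner newline is kept
text=$(printf 'a\nb\n\n\n')
echo "${#text}"
```

The length reported at the end counts the two letters and the single newline between them.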
Process substitution
Instead of a file argument, you can use command output with process substitution. The syntax is <(list)
. The shell will take care of passing a filename with the standard output of those commands. Here's an example:
# change to the 'example_files/text_files' directory for this section
$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
# can also use: paste -d, <(echo 'ID'; seq 3) scores.csv
$ paste -d, <(printf 'ID\n1\n2\n3') scores.csv
ID,Name,Maths,Physics,Chemistry
1,Ith,100,100,100
2,Cy,97,98,95
3,Lin,78,83,80
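To see that the command really receives a filename, you can print the argument instead of reading from it. This is a rough sketch that assumes Bash. The exact path (for example, something under /dev/fd) depends on the system, so no particular value is shown here.

```shell
# the command sees a path, not the data itself
path=$(echo <(true))
echo "$path"

# the path can be read like any other file argument
first_line=$(head -n1 <(printf 'ID\n1\n2\n3\n'))
echo "$first_line"
```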
For the above example, you could also have used -
to represent stdin
piped data as seen in an earlier section. Here's an example where two substitutions are used. This essentially helps you to avoid managing multiple temporary files, similar to how the |
pipe operator helps you avoid a single temporary file.
# side-by-side view of sample input files
$ paste f1.txt f2.txt
1 1
2 hello
3 3
world 4
# this command gives the common lines between two files
# the files have to be sorted for the command to work properly
$ comm -12 <(sort f1.txt) <(sort f2.txt)
1
3
See this unix.stackexchange thread for examples with the
>(list)
form.
Exercises
Use the
globs.sh
script for wildcards related exercises, unless otherwise mentioned.
Create a temporary directory for exercises that may require you to create some files. You can delete such practice directories afterwards.
1) Use the echo
command to display the text as shown below. Use appropriate quoting as necessary.
# ???
that's great! $x = $y + $z
2) Use the echo
command to display the values of the three variables in the format as shown below.
$ n1=10
$ n2=90
$ op=100
# ???
10 + 90 = 100
3) What will be the output of the command shown below?
$ echo $'\x22apple\x22: \x2710\x27'
4) List filenames starting with a digit character.
# change to the 'scripts' directory and source the 'globs.sh' script
$ source globs.sh
# ???
100.sh 42.txt
5) List filenames whose extensions do not begin with t
or l
. Assume extensions will have at least one character.
# ???
100.sh calc.py hello.py hi.sh main.c math.h
6) List filenames whose extensions have only a single character.
# ???
main.c math.h
7) List filenames whose extension is not txt
.
# ???
100.sh hello.py main.c report-00.log report-04.log
calc.py hi.sh math.h report-02.log report-98.log
8) Describe the wildcard pattern used in the command shown below.
$ ls *[^[:word:]]*.*
report-00.log report-02.log report-04.log report-98.log
9) List filenames having only lowercase letters before the extension.
# ???
calc.py hello.py hi.sh ip.txt main.c math.h notes.txt
10) List filenames starting with ma
or he
or hi
.
# ???
hello.py hi.sh main.c math.h
11) What commands would you use to get the outputs shown below? Assume that you do not know the depth of sub-directories.
# change to the 'scripts' directory and source the 'ls.sh' script
$ source ls.sh
# filenames ending with '.txt'
# ???
ip.txt todos/books.txt todos/outing.txt
# directories starting with 'c' or 'd' or 'g' or 'r' or 't'
# ???
backups/dot_files/
projects/calculator/
projects/tictactoe/
todos/
12) Create and change to an empty directory. Then, use brace expansion along with relevant commands to get the results shown below.
# ???
$ ls report*
report_2020.txt report_2021.txt report_2022.txt
# use the 'cp' command here
# ???
$ ls report*
report_2020.txt report_2021.txt report_2021.txt.bkp report_2022.txt
13) What does the set
builtin command do?
14) What does the |
pipe operator do? And when would you add the tee
command?
15) Can you infer what the following command does? Hint: see help printf
.
$ printf '%s\n' apple car dragon
apple
car
dragon
16) Use brace expansion along with relevant commands and shell features to get the result shown below. Hint: see previous question.
$ ls ip.txt
ls: cannot access 'ip.txt': No such file or directory
# ???
$ cat ip.txt
item_10
item_12
item_14
item_16
item_18
item_20
17) With ip.txt
containing text as shown in the previous question, use brace expansion and relevant commands to get the result shown below.
# ???
$ cat ip.txt
item_10
item_12
item_14
item_16
item_18
item_20
apple_1_banana_6
apple_1_banana_7
apple_1_banana_8
apple_2_banana_6
apple_2_banana_7
apple_2_banana_8
apple_3_banana_6
apple_3_banana_7
apple_3_banana_8
18) What are the differences between <
and |
shell operators, if any?
19) Which character is typically used to represent stdin
data as a file argument?
20) What do the following operators do?
a) 1>
b) 2>
c) &>
d) &>>
e) |&
21) What will be the contents of op.txt
if you use the following grep
command?
# press Ctrl+d after the line containing 'histogram'
$ grep 'hi' > op.txt
hi there
this is a sample line
have a nice day
histogram
$ cat op.txt
22) What will be the contents of op.txt
if you use the following commands?
$ qty=42
$ cat << end > op.txt
> dragon
> unicorn
> apple $qty
> ice cream
> end
$ cat op.txt
23) Correct the command to get the expected output shown below.
$ books='cradle piranesi soulhome bastion'
# something is wrong with this command
$ sed 's/\b\w/\u&/g' <<< '$books'
$Books
# ???
Cradle Piranesi Soulhome Bastion
24) Correct the command to get the expected output shown below.
# something is wrong with this command
$ echo 'hello' ; seq 3 > op.txt
hello
$ cat op.txt
1
2
3
# ???
$ cat op.txt
hello
1
2
3
25) What will be the output of the following commands?
$ printf 'hello' | tr 'a-z' 'A-Z' && echo ' there'
$ printf 'hello' | tr 'a-z' 'A-Z' || echo ' there'
26) Correct the command(s) to get the expected output shown below.
# something is wrong with these commands
$ nums=$(seq 3)
$ echo $nums
1 2 3
# ???
1
2
3
27) Will the following two commands produce equivalent output? If not, why not?
$ paste -d, <(seq 3) <(printf '%s\n' item_{1..3})
$ printf '%s\n' {1..3},item_{1..3}