cut
cut is a handy tool for many field processing use cases. The features are limited compared to awk and perl commands, but the reduced scope also leads to faster processing.
Individual field selections
By default, cut splits the input content into fields based on the tab character. You can use the -f option to select a desired field from each input line. To extract multiple fields, specify the selections separated by the comma character.
# only the second field
$ printf 'apple\tbanana\tcherry\n' | cut -f2
banana
# first and third fields
$ printf 'apple\tbanana\tcherry\n' | cut -f1,3
apple cherry
cut always displays the selected fields in ascending order, regardless of the order in which you list them. You also cannot display a field more than once.
# same as: cut -f1,3
$ printf 'apple\tbanana\tcherry\n' | cut -f3,1
apple cherry
# same as: cut -f1,2
$ printf 'apple\tbanana\tcherry\n' | cut -f1,1,2,1,2,1,1,2
apple banana
By default, cut uses the newline character as the line separator. cut will add a newline character to the output even if the last input line doesn't end with a newline.
$ printf 'good\tfood\ntip\ttap' | cut -f2
food
tap
Field ranges
You can use the - character to specify field ranges. You can omit either the start or the end of a range, but not both.
# 2nd, 3rd and 4th fields
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f2-4
banana cherry fig
# all fields from the start till the 3rd field
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f-3
apple banana cherry
# all fields from the 3rd one till the end
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f3-
cherry fig mango
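Ranges and individual fields can also be mixed in the same list, and a bare - with neither a start nor an end is rejected. Here is a small sketch of both cases, assuming GNU cut (the variable names are only for illustration):

```shell
# mix a single field with an open-ended range: field 1, then field 4 onwards
mixed=$(printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f1,4-)
printf '%s\n' "$mixed"

# a bare '-' has neither a start nor an end, so cut is expected to report an error
if printf 'apple\tbanana\tcherry\n' | cut -f- >/dev/null 2>&1; then
    range_status='accepted'
else
    range_status='rejected'
fi
echo "$range_status"
```

The first command should print the apple, fig and mango fields separated by tabs, and the second check should report rejected.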
Input field delimiter
Use the -d option to change the input delimiter. Only a single byte character is allowed. By default, the output delimiter will be the same as the input delimiter.
$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
$ cut -d, -f2,4 scores.csv
Maths,Chemistry
100,100
97,95
78,80
# use quotes if the delimiter is a shell metacharacter
$ echo 'one;two;three;four' | cut -d; -f3
cut: option requires an argument -- 'd'
Try 'cut --help' for more information.
-f3: command not found
$ echo 'one;two;three;four' | cut -d';' -f3
three
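To see the single byte restriction in action, here is a minimal sketch, assuming GNU cut. The sample string and the variable names are made up for illustration.

```shell
# a two-character delimiter is expected to be rejected by cut
if echo 'one::two::three' | cut -d'::' -f2 >/dev/null 2>&1; then
    multi_status='accepted'
else
    multi_status='rejected'
fi
echo "$multi_status"

# with a single ':' as the delimiter, every '::' contributes an empty field,
# so the fields are: one, (empty), two, (empty), three
third=$(echo 'one::two::three' | cut -d: -f3)
echo "$third"
```

With a single : as the delimiter, the text between two consecutive : characters is an empty field, which is why two ends up as the third field here.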
Output field delimiter
Use the --output-delimiter option to customize the output separator to any string of your choice. The string is treated literally. Depending on your shell, you can use ANSI-C quoting to allow escape sequences.
# same as: tr '\t' ','
$ printf 'apple\tbanana\tcherry\n' | cut --output-delimiter=, -f1-
apple,banana,cherry
# example for multicharacter output separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=' : ' -f1,3-
one : three : four
# ANSI-C quoting example
# depending on your environment, you can also press Ctrl+v and then the Tab key
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\t' -f1,3-
one three four
# newline as the output field separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\n' -f2,4
two
four
Complement
The --complement option allows you to invert the field selections.
# except the second field
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f2
apple cat
1 3 4 5
# except the first and third fields
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f1,3
ball
2 4 5
Suppress lines without delimiters
By default, lines not containing the input delimiter will still be part of the output. You can use the -s option to suppress such lines.
$ cat mixed_fields.csv
1,2,3,4
hello
a,b,c
# second line doesn't have the comma separator
# by default, such lines will be part of the output
$ cut -d, -f2 mixed_fields.csv
2
hello
b
# use the -s option to suppress such lines
$ cut -sd, -f2 mixed_fields.csv
2
b
$ cut --complement -sd, -f2 mixed_fields.csv
1,3,4
a,c
If a line contains the specified delimiter but doesn't have the field number requested, you'll get a blank line. The -s option has no effect on such lines.
$ printf 'apple ball cat\n1 2 3 4 5' | cut -d' ' -f4

4
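The following sketch puts the two behaviors side by side, assuming GNU cut. The input has one line with too few fields, one line with no delimiter at all, and one line with enough fields (the data and variable names are illustrative).

```shell
# without -s: a blank line, then 'hello' printed as is, then '4'
without_s=$(printf 'apple ball cat\nhello\n1 2 3 4 5\n' | cut -d' ' -f4)

# with -s: 'hello' is dropped, but the blank line for the first input line remains
with_s=$(printf 'apple ball cat\nhello\n1 2 3 4 5\n' | cut -s -d' ' -f4)

# brackets make the leading blank line easier to spot
printf '[%s]\n' "$without_s" "$with_s"
```

Only the line without any space character is affected by -s. The line with too few fields still produces a blank line in both cases.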
Character selections
You can use the -b or -c options to select specific bytes from each input line. The syntax is the same as the -f option. The -c option is intended for multibyte character selection, but for now it works exactly as the -b option. Character selection is useful for working with fixed-width fields.
$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11
pan
$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11 --output-delimiter=-
p-a-n
$ printf 'apple\tbanana\tcherry\n' | cut -c-5
apple
$ printf 'apple\tbanana\tcherry\n' | cut --complement -c13-
apple banana
$ printf 'cat-bat\ndog:fog\nget;pet' | cut -c5-
bat
fog
pet
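As a rough illustration of the fixed-width use case, the sketch below assumes GNU cut and a made-up record layout: columns 1-8 hold a date, columns 9-12 a level and the rest of the line a message.

```shell
# the level column only (characters 9 to 12)
levels=$(printf '20240115INFOserver started\n20240116WARNdisk usage high\n' | cut -c9-12)
printf '%s\n' "$levels"

# date and message, with a space inserted between the two selected ranges
summary=$(printf '20240115INFOserver started\n20240116WARNdisk usage high\n' |
          cut -c1-8,13- --output-delimiter=' ')
printf '%s\n' "$summary"
```

The first command should print INFO and WARN on separate lines. The second one should print each date followed by a space and the message.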
NUL separator
Use the -z option if you want to use the NUL character as the line separator. In this scenario, cut will add a final NUL character to the output even if it is not present in the input.
$ printf 'good-food\0tip-tap\0' | cut -zd- -f2 | cat -v
food^@tap^@
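One consequence of -z is that a newline becomes an ordinary character inside a record. Here is a minimal sketch, assuming GNU cut and illustrative data, where the first record contains a newline:

```shell
# two NUL-terminated records; the first one has a newline inside its second field
visible=$(printf 'a-b\nc\0d-e\0' | cut -z -d- -f2 | cat -v)
printf '%s\n' "$visible"
```

The second field of the first record should come out as b and c on two lines, since the newline is treated as part of the field content and not as a record boundary.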
Alternatives
Here are some alternate commands you can explore if cut isn't enough to solve your task.
- hck: supports regexp delimiters, field reordering, header based selection, etc
- choose: negative indexing, regexp based delimiters, etc
- xsv: fast CSV command line toolkit
- rcut: my bash+awk script, supports regexp delimiters, field reordering, negative indexing, etc
- awk: my ebook on GNU awk one-liners
- perl: my ebook on Perl one-liners
Exercises
The exercises directory has all the files used in this section.
1) Display only the third field.
$ printf 'tea\tcoffee\tchocolate\tfruit\n' | ##### add your solution here
chocolate
2) Display the second and fifth fields. Consider , as the field separator.
$ echo 'tea,coffee,chocolate,ice cream,fruit' | ##### add your solution here
coffee,fruit
3) Why does the below command not work as expected? What other tools can you use in such cases?
# not working as expected
$ echo 'apple,banana,cherry,fig' | cut -d, -f3,1,3
apple,cherry
# expected output
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
cherry,apple,cherry
4) Display all the fields except the second one in the format shown below. Can you construct two different solutions?
# solution 1
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
apple cherry fig
# solution 2
$ echo '2,3,4,5,6,7,8' | ##### add your solution here
2 4 5 6 7 8
5) Extract the first three characters from the input lines as shown below. Can you also use the head command for this purpose? If not, why not?
$ printf 'apple\nbanana\ncherry\nfig\n' | ##### add your solution here
app
ban
che
fig
6) Display only the first and third fields of the scores.csv input file, with tab as the output field separator.
$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
##### add your solution here
Name Physics
Ith 100
Cy 98
Lin 83
7) The given input data uses one or more : characters as the field separator. Assume that no field content will have the : character. Display all the fields except the second one, with ' : ' (a colon surrounded by spaces) as the output field separator.
$ cat books.txt
Cradle:::Mage Errant::The Weirkey Chronicles
Mother of Learning::Eight:::::Dear Spellbook:Ascendant
Mark of the Fool:Super Powereds:::Ends of Magic
##### add your solution here
Cradle : The Weirkey Chronicles
Mother of Learning : Dear Spellbook : Ascendant
Mark of the Fool : Ends of Magic
8) Which option would you use to not display lines that do not contain the input delimiter character?
9) Modify the command to get the expected output shown below.
$ printf 'apple\nbanana\ncherry\n' | cut -c-3 --output-delimiter=:
app
ban
che
$ printf 'apple\nbanana\ncherry\n' | ##### add your solution here
a:p:p
b:a:n
c:h:e
10) Figure out the logic based on the given input and output data.
$ printf 'apple\0fig\0carpet\0jeep\0' | ##### add your solution here | cat -v
ple^@g^@rpet^@ep^@