cut

cut is a handy tool for many field processing use cases. The features are limited compared to awk and perl commands, but the reduced scope also leads to faster processing.

Individual field selections

By default, cut splits the input content into fields based on the tab character. You can use the -f option to select a desired field from each input line. To extract multiple fields, specify the selections separated by the comma character.

# only the second field
$ printf 'apple\tbanana\tcherry\n' | cut -f2
banana

# first and third fields
$ printf 'apple\tbanana\tcherry\n' | cut -f1,3
apple   cherry

cut always displays the selected fields in ascending order, regardless of the order given to the -f option. A field cannot be displayed more than once, so duplicate selections are ignored.

# same as: cut -f1,3
$ printf 'apple\tbanana\tcherry\n' | cut -f3,1
apple   cherry

# same as: cut -f1,2
$ printf 'apple\tbanana\tcherry\n' | cut -f1,1,2,1,2,1,1,2
apple   banana

By default, cut uses the newline character as the line separator. cut will add a newline character to the output even if the last input line doesn't end with a newline.

$ printf 'good\tfood\ntip\ttap' | cut -f2
food
tap
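
One way to confirm the added newline is to count newline characters before and after cut. This is a minimal sketch that relies on wc -l counting newline characters rather than lines.

```shell
# the input has only one newline character, so this prints 1
printf 'good\tfood\ntip\ttap' | wc -l

# cut adds the missing final newline, so this prints 2
printf 'good\tfood\ntip\ttap' | cut -f2 | wc -l
```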

Field ranges

You can use the - character to specify field ranges. You can omit the start or the end of a range, but not both.

# 2nd, 3rd and 4th fields
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f2-4
banana  cherry  fig

# all fields from the start till the 3rd field
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f-3
apple   banana  cherry

# all fields from the 3rd one till the end
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f3-
cherry  fig     mango
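
Ranges and individual field numbers can be mixed in the same comma-separated list. A small sketch reusing the sample input from the examples above:

```shell
# 1st field, followed by the 3rd and 4th fields
# output: apple, cherry and fig separated by tabs
printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f1,3-4

# first two fields, followed by everything from the 4th field onwards
# output: apple, banana, fig and mango separated by tabs
printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f-2,4-
```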

Input field delimiter

Use the -d option to change the input delimiter. Only a single-byte character is allowed. By default, the output delimiter will be the same as the input delimiter.

$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80

$ cut -d, -f2,4 scores.csv
Maths,Chemistry
100,100
97,95
78,80

# use quotes if the delimiter is a shell metacharacter
$ echo 'one;two;three;four' | cut -d; -f3
cut: option requires an argument -- 'd'
Try 'cut --help' for more information.
-f3: command not found
$ echo 'one;two;three;four' | cut -d';' -f3
three
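
Because the delimiter is a single byte, consecutive delimiters are not merged. Each occurrence separates two fields, so empty fields are preserved. The sample input below is made up for illustration.

```shell
# the 2nd field is empty, so an empty line is printed
printf 'a,,c\n' | cut -d, -f2

# the 3rd field is still 'c'
printf 'a,,c\n' | cut -d, -f3

# empty fields are retained in the output as well
# output: ,c
printf 'a,,c\n' | cut -d, -f2-
```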

Output field delimiter

Use the --output-delimiter option to customize the output separator to any string of your choice. The string is treated literally, so escape sequences such as \t are not interpreted by cut. Depending on your shell, you can use ANSI-C quoting to allow escape sequences.

# same as: tr '\t' ','
$ printf 'apple\tbanana\tcherry\n' | cut --output-delimiter=, -f1-
apple,banana,cherry

# example for multicharacter output separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=' : ' -f1,3-
one : three : four

# ANSI-C quoting example
# depending on your environment, you can also press Ctrl+v and then the Tab key
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\t' -f1,3-
one     three   four

# newline as the output field separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\n' -f2,4
two
four
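
To illustrate the literal treatment of the output delimiter, here is a short sketch that passes \t inside single quotes. The second command uses printf in a command substitution, which is one possible fallback for shells that don't support ANSI-C quoting.

```shell
# single quotes: cut receives a backslash followed by the letter t
# output: one\ttwo
printf 'one;two\n' | cut -d';' --output-delimiter='\t' -f1,2

# printf produces an actual tab character before cut is invoked
# output: one and two separated by a tab
printf 'one;two\n' | cut -d';' --output-delimiter="$(printf '\t')" -f1,2
```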

Complement

The --complement option allows you to invert the field selections.

# except the second field
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f2
apple cat
1 3 4 5

# except the first and third fields
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f1,3
ball
2 4 5
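
The --complement option works with ranges too. A small sketch, assuming GNU cut:

```shell
# drop the 2nd to 4th fields
# output: 1 5
printf '1 2 3 4 5\n' | cut --complement -d' ' -f2-4

# drop everything from the 3rd field onwards
# output: 1 2
printf '1 2 3 4 5\n' | cut --complement -d' ' -f3-
```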

Suppress lines without delimiters

By default, lines not containing the input delimiter will still be part of the output. You can use the -s option to suppress such lines.

$ cat mixed_fields.csv
1,2,3,4
hello
a,b,c

# second line doesn't have the comma separator
# by default, such lines will be part of the output
$ cut -d, -f2 mixed_fields.csv
2
hello
b

# use the -s option to suppress such lines
$ cut -sd, -f2 mixed_fields.csv
2
b

$ cut --complement -sd, -f2 mixed_fields.csv
1,3,4
a,c

Note: If a line contains the specified delimiter but doesn't have the requested field number, you'll get a blank line. The -s option has no effect on such lines.

$ printf 'apple ball cat\n1 2 3 4 5' | cut -d' ' -f4

4
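
The following sketch puts the two cases side by side. With -s, the line without any delimiter is dropped, while the line that has delimiters but lacks a fourth field still produces an empty line.

```shell
# 'hello' has no space character and is suppressed by -s
# 'apple ball cat' has spaces but no 4th field, so an empty line is printed
# output: an empty line followed by 4
printf 'apple ball cat\nhello\n1 2 3 4 5\n' | cut -s -d' ' -f4
```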

Character selections

You can use the -b or -c options to select specific bytes from each input line. The syntax is the same as for the -f option. The -c option is intended for multibyte character selection, but for now it works exactly like the -b option. Character selection is useful for working with fixed-width fields.

$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11
pan

$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11 --output-delimiter=-
p-a-n

$ printf 'apple\tbanana\tcherry\n' | cut -c-5
apple

$ printf 'apple\tbanana\tcherry\n' | cut --complement -c13-
apple   banana

$ printf 'cat-bat\ndog:fog\nget;pet' | cut -c5-
bat
fog
pet
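
As a sketch of the fixed-width use case, assume a hypothetical log format where the date occupies columns 1 to 10 and the severity occupies columns 12 to 16. The sample line and the column positions are made up for illustration.

```shell
# date is in columns 1-10
# output: 2024-01-15
printf '2024-01-15 ERROR disk full\n' | cut -c1-10

# severity is in columns 12-16
# output: ERROR
printf '2024-01-15 ERROR disk full\n' | cut -c12-16

# both ranges, joined with a comma
# output: 2024-01-15,ERROR
printf '2024-01-15 ERROR disk full\n' | cut -c1-10,12-16 --output-delimiter=,
```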

NUL separator

Use the -z option if you want to use the NUL character as the line separator. In this scenario, cut will add a final NUL character to the output even if it is not present in the input.

$ printf 'good-food\0tip-tap\0' | cut -zd- -f2 | cat -v
food^@tap^@
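
To see the final NUL character being added, the sketch below omits the trailing NUL from the input. tr is used here only to make the result readable.

```shell
# the input lacks a trailing NUL, cut still terminates the last record
# output: food and tap on separate lines
printf 'good-food\0tip-tap' | cut -zd- -f2 | tr '\0' '\n'

# count the NUL characters in the output, which prints 2
printf 'good-food\0tip-tap' | cut -zd- -f2 | tr -cd '\0' | wc -c
```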

Alternatives

Here are some alternative commands you can explore if cut isn't enough to solve your task.

  • hck — supports regexp delimiters, field reordering, header based selection, etc
  • choose — negative indexing, regexp based delimiters, etc
  • xsv — fast CSV command line toolkit
  • rcut — my bash+awk script, supports regexp delimiters, field reordering, negative indexing, etc
  • awk — my ebook on GNU awk one-liners
  • perl — my ebook on Perl one-liners

Exercises

Note: The exercises directory has all the files used in this section.

1) Display only the third field.

$ printf 'tea\tcoffee\tchocolate\tfruit\n' | ##### add your solution here
chocolate

2) Display the second and fifth fields. Consider , as the field separator.

$ echo 'tea,coffee,chocolate,ice cream,fruit' | ##### add your solution here
coffee,fruit

3) Why does the below command not work as expected? What other tools can you use in such cases?

# not working as expected
$ echo 'apple,banana,cherry,fig' | cut -d, -f3,1,3
apple,cherry

# expected output
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
cherry,apple,cherry

4) Display all the fields except the second one, in the format shown below. Can you construct two different solutions?

# solution 1
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
apple cherry fig

# solution 2
$ echo '2,3,4,5,6,7,8' | ##### add your solution here
2 4 5 6 7 8

5) Extract the first three characters from the input lines as shown below. Can you also use the head command for this purpose? If not, why not?

$ printf 'apple\nbanana\ncherry\nfig\n' | ##### add your solution here
app
ban
che
fig

6) Display only the first and third fields of the scores.csv input file, with tab as the output field separator.

$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80

##### add your solution here
Name    Physics
Ith     100
Cy      98
Lin     83

7) The given input data uses one or more : characters as the field separator. Assume that no field content will have the : character. Display all the fields except the second one, with ' : ' (a colon surrounded by spaces) as the output field separator.

$ cat books.txt
Cradle:::Mage Errant::The Weirkey Chronicles
Mother of Learning::Eight:::::Dear Spellbook:Ascendant
Mark of the Fool:Super Powereds:::Ends of Magic

##### add your solution here
Cradle : The Weirkey Chronicles
Mother of Learning : Dear Spellbook : Ascendant
Mark of the Fool : Ends of Magic

8) Which option would you use to not display lines that do not contain the input delimiter character?

9) Modify the command to get the expected output shown below.

$ printf 'apple\nbanana\ncherry\n' | cut -c-3 --output-delimiter=:
app
ban
che

$ printf 'apple\nbanana\ncherry\n' | ##### add your solution here
a:p:p
b:a:n
c:h:e

10) Figure out the logic based on the given input and output data.

$ printf 'apple\0fig\0carpet\0jeep\0' | ##### add your solution here | cat -v
ple^@g^@rpet^@ep^@