paste
paste is typically used to merge two or more files column wise. It also has a handy feature for serializing data.
Concatenating files column wise
Consider these two input files:
$ cat colors_1.txt
Blue
Brown
Orange
Purple
Red
Teal
White
$ cat colors_2.txt
Black
Blue
Green
Orange
Pink
Red
White
By default, paste adds a tab character between corresponding lines of the input files.
$ paste colors_1.txt colors_2.txt
Blue Black
Brown Blue
Orange Green
Purple Orange
Red Pink
Teal Red
White White
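A tab is hard to tell apart from spaces on screen, so it can help to make it visible. Here's a minimal sketch, assuming GNU cat (its -A option shows a tab as ^I and the end of each line as $). The scratch files a.txt and b.txt are hypothetical, shortened stand-ins for the two color lists.

```shell
# hypothetical scratch files, shortened versions of the color lists
dir=$(mktemp -d)
printf 'Blue\nBrown\n' > "$dir/a.txt"
printf 'Black\nBlue\n' > "$dir/b.txt"

# cat -A (GNU cat) shows each tab as ^I and each line end as $
out=$(paste "$dir/a.txt" "$dir/b.txt" | cat -A)
printf '%s\n' "$out"
# Blue^IBlack$
# Brown^IBlue$

rm -r "$dir"
```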
You can use the -d option to change the delimiter between the columns. The separator is added even if the data has been exhausted for some of the input files. Here are some examples with single-character delimiters. Multicharacter separation will be discussed later.
$ seq 4 | paste -d, - <(seq 6 9)
1,6
2,7
3,8
4,9
# quote the delimiter if it is a shell metacharacter
$ paste -d'|' <(seq 3) <(seq 4 5) <(seq 6 8)
1|4|6
2|5|7
3||8
Use an empty string if you don't want any delimiter between the columns. You can also use \0 for this case, but that'd be confusing since it is typically used to mean the NUL character.
# note that the space between -d and empty string is necessary here
$ paste -d '' <(seq 3) <(seq 6 8)
16
27
38
You can pass the same filename multiple times too; each occurrence is treated as a separate input. This doesn't apply to stdin data though, which is a special case discussed in a later section.
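Here's a small sketch of the repeated filename case. The scratch file is hypothetical and holds three made-up lines.

```shell
# hypothetical scratch file with three lines
f=$(mktemp)
printf 'a\nb\nc\n' > "$f"

# the file is opened twice and each copy is read from its first line
out=$(paste -d, "$f" "$f")
printf '%s\n' "$out"
# a,a
# b,b
# c,c

rm "$f"
```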
Interleaving lines
By setting the newline character as the delimiter, you'll get interleaved lines.
$ paste -d'\n' <(seq 11 13) <(seq 101 103)
11
101
12
102
13
103
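Keep in mind that the delimiter is added even after an input runs out, so with the newline delimiter a shorter input leaves empty lines behind. Here's a sketch with made-up data. The scratch file name is hypothetical, and sed is used only to make the empty lines visible.

```shell
# hypothetical scratch file with a single line
short=$(mktemp)
printf 'x\n' > "$short"

# stdin has three lines, the file has only one
# sed marks the empty lines so that they stand out
out=$(printf 'a\nb\nc\n' | paste -d'\n' - "$short" | sed 's/^$/<empty>/')
printf '%s\n' "$out"
# a
# x
# b
# <empty>
# c
# <empty>

rm "$short"
```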
Multiple columns from single input
If you use - multiple times, paste will consume a line from stdin data every time - is encountered. This is different from using the same filename multiple times, in which case they are treated as separate inputs.
This special case for stdin data is useful to combine consecutive lines using the given delimiter. Here are some examples to help you understand this feature better:
# two columns
$ seq 10 | paste -d, - -
1,2
3,4
5,6
7,8
9,10
# five columns
$ seq 10 | paste -d: - - - - -
1:2:3:4:5
6:7:8:9:10
# use shell redirection for file input
$ <colors_1.txt paste -d: - - -
Blue:Brown:Orange
Purple:Red:Teal
White::
Here's an example with both stdin and file arguments:
$ seq 6 | paste - nums.txt -
1 3.14 2
3 42 4
5 1000 6
If you don't want to manually type the number of - required, you can use this printf trick:
# the string before %.s is repeated based on the number of arguments
$ printf 'x %.s' a b c
x x x
$ printf -- '- %.s' {1..5}
- - - - -
$ seq 10 | paste -d, $(printf -- '- %.s' {1..5})
1,2,3,4,5
6,7,8,9,10
See this Stack Overflow thread for more details about the printf solution and other alternatives.
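If you need this often, the printf trick can be wrapped in a small shell function. The sketch below is one possible way to do it. The function name ncols and its argument order are made up for illustration, and seq is used instead of brace expansion so that the count can come from a variable.

```shell
# hypothetical helper: ncols N DELIM
# joins every N lines of stdin into one line using DELIM
ncols() {
    # $(seq "$1") expands to N words, so printf emits '- ' N times
    # the unquoted substitution is then split into N separate - arguments
    paste -d"$2" $(printf -- '- %.s' $(seq "$1"))
}

out=$(seq 10 | ncols 5 ,)
printf '%s\n' "$out"
# 1,2,3,4,5
# 6,7,8,9,10
```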
Multicharacter delimiters
The -d option accepts a list of characters (bytes to be precise) to be used one by one between the different columns. If the number of characters is less than the number of separators required, the characters are reused from the beginning and this cycle repeats until all the columns are done. If the number of characters is greater than the number of separators required, the extra characters are simply discarded.
# , is used between the 1st and 2nd columns
# - is used between the 2nd and 3rd columns
$ paste -d',-' <(seq 3) <(seq 4 6) <(seq 7 9)
1,4-7
2,5-8
3,6-9
# only 3 separators are needed, the rest are discarded
$ paste -d',-:;.[]' <(seq 3) <(seq 4 6) <(seq 7 9) <(seq 10 12)
1,4-7:10
2,5-8:11
3,6-9:12
# 2 characters given, 4 separators needed
# paste will reuse from the start of the list
$ seq 10 | paste -d':,' - - - - -
1:2,3:4,5
6:7,8:9,10
You can use empty files to get multicharacter separation between the columns.
$ paste -d' : ' <(seq 3) /dev/null /dev/null <(seq 4 6)
1 : 4
2 : 5
3 : 6
# create an empty file to avoid typing /dev/null too many times
$ > e
$ paste -d' :  - ' <(seq 3) e e <(seq 4 6) e e <(seq 7 9)
1 : 4 - 7
2 : 5 - 8
3 : 6 - 9
Serialize
The -s option allows you to combine all the input lines from a file into a single line using the given delimiter. paste always ends the output with a newline character, even if the input didn't have one.
# this will give you a trailing comma
# and there won't be a newline character at the end
$ <colors_1.txt tr '\n' ','
Blue,Brown,Orange,Purple,Red,Teal,White,
# paste changes the separator between the lines only
# and there will be a newline character at the end
$ paste -sd, colors_1.txt
Blue,Brown,Orange,Purple,Red,Teal,White
# newline gets added at the end even if not present in the input
$ printf 'apple\nbanana\ncherry' | paste -sd-
apple-banana-cherry
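Serialization is also handy for building expressions. As a sketch of one such use (not something specific to paste itself), joining numbers with + gives an arithmetic expression that the shell can evaluate. The variable names here are made up for illustration.

```shell
# join the numbers 1 to 5 with + to form an arithmetic expression
expr_str=$(seq 5 | paste -sd+)
printf '%s\n' "$expr_str"
# 1+2+3+4+5

# let shell arithmetic expansion evaluate the expression
total=$(( $expr_str ))
printf '%s\n' "$total"
# 15
```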
If multiple files are passed, serialization of each file is displayed on separate lines.
$ paste -sd: colors_1.txt colors_2.txt
Blue:Brown:Orange:Purple:Red:Teal:White
Black:Blue:Green:Orange:Pink:Red:White
$ paste -sd, <(seq 3) <(seq 5 9)
1,2,3
5,6,7,8,9
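The delimiter list is cycled in -s mode too, which gives another way to reshape a single column. Here's a minimal sketch: alternating a comma and a newline joins the input two lines at a time.

```shell
# delimiters alternate between , and newline
out=$(seq 6 | paste -sd',\n')
printf '%s\n' "$out"
# 1,2
# 3,4
# 5,6
```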
NUL separator
Use the -z option if you want to use the NUL character as the line separator. In this scenario, paste always ends the output with a NUL character, even if the input didn't have one.
$ printf 'a\0b\0c\0d\0e\0f\0g\0h' | paste -z -d: - - - - | cat -v
a:b:c:d^@e:f:g:h^@
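The -z option can be combined with -s as well. Here's a sketch, assuming NUL-separated input (printf stands in for a real data source here). tr converts the final NUL to a newline for display.

```shell
# three NUL-terminated items, two of them containing spaces
out=$(printf 'a b\0c\0d e\0' | paste -z -sd, | tr '\0' '\n')
printf '%s\n' "$out"
# a b,c,d e
```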
Exercises
The exercises directory has all the files used in this section.
1) What's the default delimiter character added by the paste command? Which option would you use to customize this separator?
2) Will the following two commands produce equivalent output? If not, why not?
$ paste -d, <(seq 3) <(printf '%s\n' item_{1..3})
$ printf '%s\n' {1..3},item_{1..3}
3) Combine the two data sources as shown below.
$ printf '1)\n2)\n3)'
1)
2)
3)
$ cat fruits.txt
banana
papaya
mango
##### add your solution here
1)banana
2)papaya
3)mango
4) Interleave the contents of fruits.txt and books.txt.
##### add your solution here
banana
Cradle:::Mage Errant::The Weirkey Chronicles
papaya
Mother of Learning::Eight:::::Dear Spellbook:Ascendant
mango
Mark of the Fool:Super Powereds:::Ends of Magic
5) Generate numbers 1 to 9 in two different formats as shown below.
##### add your solution here
1:2:3
4:5:6
7:8:9
##### add your solution here
1 : 4 : 7
2 : 5 : 8
3 : 6 : 9
6) Combine the contents of fruits.txt and colors.txt as shown below.
$ cat fruits.txt
banana
papaya
mango
$ cat colors.txt
deep blue
light orange
blue delight
##### add your solution here
banana,deep blue,papaya,light orange,mango,blue delight