tr

tr helps you map one set of characters to another. Features like ranges, repeats, character sets, squeeze and complement make it a must-know text processing tool.

To be precise, tr can handle only bytes. Multibyte character processing isn't supported yet.

Transliteration

Here are some examples that map one set of characters to another. As a good practice, always enclose the sets in single quotes to avoid issues due to shell metacharacters.

# 'l' maps to '1', 'e' to '3', 't' to '7' and 's' to '5'
$ echo 'leet speak' | tr 'lets' '1375'
1337 5p3ak

# example with shell metacharacters
$ echo 'apple;banana;cherry' | tr ; :
tr: missing operand
Try 'tr --help' for more information.
$ echo 'apple;banana;cherry' | tr ';' ':'
apple:banana:cherry

You can use - between two characters to construct a range (ascending order only).

# uppercase to lowercase
$ echo 'HELLO WORLD' | tr 'A-Z' 'a-z'
hello world

# swap case
$ echo 'Hello World' | tr 'a-zA-Z' 'A-Za-z'
hELLO wORLD

# rot13
$ echo 'Hello World' | tr 'a-zA-Z' 'n-za-mN-ZA-M'
Uryyb Jbeyq
$ echo 'Uryyb Jbeyq' | tr 'a-zA-Z' 'n-za-mN-ZA-M'
Hello World
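
Since the same rot13 mapping works in both directions, it can be convenient to wrap it in a shell function. Here is a minimal sketch; the function name rot13 is only an illustrative choice and isn't part of tr.

```shell
# rot13 is a hypothetical helper name; it reads stdin and writes to stdout
rot13() {
    tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

# encode once
echo 'Hello World' | rot13

# applying the helper twice gives back the original text
echo 'Hello World' | rot13 | rot13
```

The function simply passes stdin through tr, so it can be placed anywhere in a pipeline.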

tr works only on stdin data, so use shell input redirection for file inputs.

$ tr 'a-z' 'A-Z' <greeting.txt
HI THERE
HAVE A NICE DAY
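
The sketch below shows a few ways of getting data onto the stdin of tr. The file sample.txt and the variable names are made up for this illustration and are created in a temporary directory.

```shell
# sample.txt is a throwaway file used only for this illustration
tmpdir=$(mktemp -d)
printf 'hi there\n' > "$tmpdir/sample.txt"

# redirect a single file to stdin
from_file=$(tr 'a-z' 'A-Z' < "$tmpdir/sample.txt")

# use cat and a pipe to process more than one file
from_cat=$(cat "$tmpdir/sample.txt" "$tmpdir/sample.txt" | tr 'a-z' 'A-Z')

# pipe the contents of a shell variable
text='have a nice day'
from_var=$(printf '%s\n' "$text" | tr 'a-z' 'A-Z')

printf '%s\n' "$from_file" "$from_cat" "$from_var"
rm -r "$tmpdir"
```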

Different length sets

If the second set is longer, the extra characters are simply ignored. If the first set is longer, the last character of the second set is reused for the missing mappings.

# only abc gets converted to uppercase
$ echo 'apple banana cherry' | tr 'abc' 'A-Z'
Apple BAnAnA Cherry

# c-z will be converted to C
$ echo 'apple banana cherry' | tr 'a-z' 'ABC'
ACCCC BACACA CCCCCC

You can use the -t option to truncate the first set so that it matches the length of the second set.

# d-z won't be converted
$ echo 'apple banana cherry' | tr -t 'a-z' 'ABC'
Apple BAnAnA Cherry

You can also use the [c*n] notation to repeat a character c n times. You can specify n in decimal or octal format (octal values start with 0). If n is omitted, the character c is repeated as many times as needed to equalize the length of the sets.

# a-e will be translated to A
# f-z will be uppercased
$ echo 'apple banana cherry' | tr 'a-z' '[A*5]F-Z'
APPLA AANANA AHARRY

# a-c and x-z will be uppercased
# rest of the characters will be translated to -
$ echo 'apple banana cherry' | tr 'a-z' 'ABC[-*]XYZ'
A---- BA-A-A C----Y
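
To make the decimal and octal forms of the repeat count concrete, here is a small sketch with a made-up ten character input. The variable names are only for illustration.

```shell
# decimal count: x is repeated 8 times
# so a-h map to x, i maps to y and j maps to z
dec=$(echo 'abcdefghij' | tr 'a-j' '[x*8]yz')

# octal count: 010 is 8 in decimal, so the mapping is the same as above
oct=$(echo 'abcdefghij' | tr 'a-j' '[x*010]yz')

# no count: x fills the gap between the leading A and the trailing yz
fill=$(echo 'abcdefghij' | tr 'a-j' 'A[x*]yz')

printf '%s\n' "$dec" "$oct" "$fill"
```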

Escape sequences and character sets

Characters such as newline and tab can be represented using escape sequences. You can also specify characters using the \NNN octal representation.

# same as: tr '\011' '\072'
$ printf 'apple\tbanana\tcherry\n' | tr '\t' ':'
apple:banana:cherry

$ echo 'apple:banana:cherry' | tr ':' '\n'
apple
banana
cherry
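
Escape sequences are also handy for characters that are hard to type, such as the NUL character. The sketch below assumes input where items are terminated by NUL (printf is used here as a stand-in for a tool that produces such output) and converts each NUL to a newline.

```shell
# printf emits three NUL-terminated items
# tr changes each NUL character to a newline
items=$(printf 'apple\0banana\0cherry\0' | tr '\0' '\n')
printf '%s\n' "$items"

# same conversion, with both characters written as three-digit octal values
items_octal=$(printf 'apple\0banana\0cherry\0' | tr '\000' '\012')
printf '%s\n' "$items_octal"
```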

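Certain commonly useful groups of characters like alphabets, digits, punctuation, etc. have named character sets that you can use instead of manually creating the sets. When translating, only [:lower:] and [:upper:] can be used in the second set (paired with each other for case conversion). The other classes are accepted in the first set and in sets that are used only for deleting (-d) or squeezing (-s).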

# same as: tr 'a-z' 'A-Z' <greeting.txt
$ tr '[:lower:]' '[:upper:]' <greeting.txt
HI THERE
HAVE A NICE DAY
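
The two case classes can also be combined in a single command. The following sketch repeats the earlier swap case example using named sets; it assumes that the classes appear at matching positions in both sets, which is what tr expects for case conversion.

```shell
# swap case, same idea as: tr 'A-Za-z' 'a-zA-Z'
# [:upper:] lines up with [:lower:] and vice versa
swapped=$(echo 'Hello World' | tr '[:upper:][:lower:]' '[:lower:][:upper:]')
printf '%s\n' "$swapped"
```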

To override the special meaning for - and \ characters, you can escape them using the \ character. You can also place the - character at the end of a set to represent it literally. Can you reason out why placing the - character at the start of a set can cause issues?

$ echo '/python-projects/programs' | tr '/-' '\\_'
\python_projects\programs
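
Here are two more illustrations of escaping, using made-up inputs. The first converts backslashes to forward slashes and the second swaps the - and _ characters.

```shell
# a literal backslash has to be written as \\ inside a set
# printf with %s is used so that the backslashes in the input are kept as-is
unix_path=$(printf '%s\n' 'C:\Users\docs' | tr '\\' '/')

# \- is a literal hyphen regardless of its position in the set
swapped=$(echo 'a-b_c' | tr '\-_' '_\-')

printf '%s\n' "$unix_path" "$swapped"
```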

See the tr manual for more details and a list of all the escape sequences and character sets.

Deleting characters

Use the -d option to specify a set of characters to be deleted.

$ echo '2024-08-12' | tr -d '-'
20240812

# delete all punctuation characters
$ s='"Hi", there! How *are* you? All fine here.'
$ echo "$s" | tr -d '[:punct:]'
Hi there How are you All fine here
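
Deleting a single escape sequence is a common use of the -d option. As a sketch, the following removes carriage return characters from text with DOS-style line endings (the input is generated with printf for illustration).

```shell
# each input line ends with \r\n
# deleting \r leaves only the newline characters
unix_text=$(printf 'hi there\r\nhave a nice day\r\n' | tr -d '\r')
printf '%s\n' "$unix_text"

# byte counts before and after, to confirm that two bytes were removed
printf 'hi there\r\nhave a nice day\r\n' | wc -c
printf 'hi there\r\nhave a nice day\r\n' | tr -d '\r' | wc -c
```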

Complement

The -c option will invert the first set of characters. This is often used in combination with the -d option.

$ s='"Hi", there! How *are* you? All fine here.'

# retain alphabets, whitespaces, period, exclamation and question mark
$ echo "$s" | tr -cd 'a-zA-Z.!?[:space:]'
Hi there! How are you? All fine here.
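
The -cd combination is also useful for extracting or counting specific characters. The inputs below are made up for illustration.

```shell
# keep only the digits
# the newline is retained as well so that the output ends cleanly
digits=$(echo 'tel: +1 (555) 010-9999' | tr -cd '0-9\n')

# count the occurrences of 'a'
# everything else is deleted and wc counts the bytes that remain
count=$(echo 'banana' | tr -cd 'a' | wc -c)

printf '%s\n' "$digits" "$count"
```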

If you use -c for transliteration, the second set is usually a single character: every character not in the first set is mapped to that character. GNU tr reports an error if the complemented first set contains a character class and the second set has more than one distinct character.

$ s='"Hi", there! How *are* you? All fine here.'

$ echo "$s" | tr -c 'a-zA-Z.!?[:space:]' '1%'
tr: when translating with complemented character classes,
string2 must map all characters in the domain to one

$ echo "$s" | tr -c 'a-zA-Z.!?[:space:]' '%'
%Hi%% there! How %are% you? All fine here.
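
One possible use of this feature is replacing characters that are awkward in filenames. This is only a sketch: the input name and the choice of _ as the replacement are assumptions made for illustration.

```shell
# everything other than letters, digits, period and newline becomes _
safe=$(echo 'my file (v2).txt' | tr -c '[:alnum:].\n' '_')
printf '%s\n' "$safe"
```

The newline is part of the first set so that the line ending added by echo isn't converted.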

Squeeze

The -s option changes consecutive repeated characters to a single copy of that character.

# squeeze lowercase alphabets
$ echo 'HELLO... hhoowwww aaaaaareeeeee yyouuuu!!' | tr -s 'a-z'
HELLO... how are you!!

# translate and squeeze
$ echo 'hhoowwww aaaaaareeeeee yyouuuu!!' | tr -s 'a-z' 'A-Z'
HOW ARE YOU!!

# delete and squeeze
$ echo 'hhoowwww aaaaaareeeeee yyouuuu!!' | tr -sd '!' 'a-z'
how are you

# squeeze other than lowercase alphabets
$ echo 'apple    noon     banana!!!!!' | tr -cs 'a-z'
apple noon banana!
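
Combining -c and -s with two sets gives a simple way to split text into words. In this sketch, a word is assumed to be a run of English letters; every other character is changed to a newline and repeated newlines are squeezed.

```shell
# non-letters are translated to newline
# consecutive newlines are then squeezed into one
words=$(echo 'Hi there! How are you?' | tr -cs 'a-zA-Z' '\n')
printf '%s\n' "$words"

# number of words found
printf '%s\n' "$words" | wc -l
```

You can pipe the result to commands like sort and uniq for further processing.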

Exercises

The exercises directory has all the files used in this section.

1) What's wrong with the following command?

$ echo 'apple#banana#cherry' | tr # :

2) Retain only alphabets, digits and whitespace characters.

$ printf 'Apple_42  cool,blue\tDragon:army\n' | ##### add your solution here
Apple42  coolblue       Dragonarmy

3) Similar to rot13, figure out a way to shift digits such that the same logic can be used both ways.

$ echo '4780 89073' | ##### add your solution here
9235 34528

$ echo '9235 34528' | ##### add your solution here
4780 89073

4) Figure out the logic based on the given input and output data. Hint: use two ranges for the first set and only 6 characters in the second set.

$ echo 'apple banana cherry damson etrog' | ##### add your solution here
1XXl5 21n1n1 3h5XXX 41mXon 5XXog

5) Which option would you use to truncate the first set so that it matches the length of the second set?

6) What does the * notation do in the second set?

7) Change : to - and ; to the newline character.

$ echo 'tea:coffee;brown:teal;dragon:unicorn' | ##### add your solution here
tea-coffee
brown-teal
dragon-unicorn

8) Convert all characters to * except digit and newline characters.

$ echo 'ajsd45_sdg2Khnf4v_54as' | ##### add your solution here
****45****2****4**54**

9) Change consecutive repeated punctuation characters to a single punctuation character.

$ echo '""hi..."", good morning!!!!' | ##### add your solution here
"hi.", good morning!

10) Figure out the logic based on the given input and output data.

$ echo 'Aapple    noon     banana!!!!!' | ##### add your solution here
:apple:noon:banana:

11) The books.txt file has items separated by one or more : characters. Change this separator to a single newline character as shown below.

$ cat books.txt
Cradle:::Mage Errant::The Weirkey Chronicles
Mother of Learning::Eight:::::Dear Spellbook:Ascendant
Mark of the Fool:Super Powereds:::Ends of Magic

##### add your solution here
Cradle
Mage Errant
The Weirkey Chronicles
Mother of Learning
Eight
Dear Spellbook
Ascendant
Mark of the Fool
Super Powereds
Ends of Magic