wc
The wc command is useful for counting the number of lines, words and characters in the given inputs.
Line, word and byte counts
By default, the wc command reports the number of lines, words and bytes (in that order). The byte count includes the newline characters, so you can use that as a measure of file size as well. Here's an example:
$ cat greeting.txt
Hi there
Have a nice day
$ wc greeting.txt
 2  6 25 greeting.txt
Wondering why there are leading spaces in the output? They help in aligning results for multiple files (discussed later).
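Since the default output is three whitespace separated numbers, the shell's read builtin can split them into separate variables. The following is a minimal sketch rather than part of the original examples: the variable names (summary, lines, words, bytes) are made up for illustration, and the input mirrors the contents of greeting.txt.

```shell
# Feed the same two lines as greeting.txt through stdin,
# then let read split the three counts on whitespace.
summary=$(printf 'Hi there\nHave a nice day\n' | wc | {
    read -r lines words bytes
    echo "lines=$lines words=$words bytes=$bytes"
})
echo "$summary"
```

The counts are read inside a brace group because, in many shells, each part of a pipeline runs in its own subshell, so variables set by read would not survive past the pipeline.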
Individual counts
Instead of the three default values, you can use options to get only the particular counts you are interested in. These options are:
-l for line count
-w for word count
-c for byte count
$ wc -l greeting.txt
2 greeting.txt
$ wc -w greeting.txt
6 greeting.txt
$ wc -c greeting.txt
25 greeting.txt
$ wc -wc greeting.txt
6 25 greeting.txt
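The order in which the options are given doesn't change the order of the output columns: the selected counts are always printed as lines, then words, then bytes. A small sketch to check this (the variable names out_wc and out_cw are illustrative):

```shell
# Same input, options given in opposite order
out_wc=$(printf 'hello world\n' | wc -wc)
out_cw=$(printf 'hello world\n' | wc -cw)

# Both show the word count first, followed by the byte count
echo "$out_wc"
echo "$out_cw"
```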
With stdin data, you'll get only the count value (unless you use - as the filename to refer to stdin). This is useful for assigning the output to shell variables.
$ printf 'hello' | wc -c
5
$ printf 'hello' | wc -c -
5 -
$ lines=$(wc -l <greeting.txt)
$ echo "$lines"
2
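Because redirecting the file to stdin leaves only the number in the output, the saved value can be used directly in numeric tests. Here's a sketch that assumes a throwaway file created with mktemp and an arbitrary limit of 2 lines:

```shell
# Create a temporary three-line file for the demonstration
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Redirection keeps the filename out of the output
n=$(wc -l < "$tmp")

if [ "$n" -gt 2 ]; then
    msg="more than 2 lines"
else
    msg="2 lines or fewer"
fi
echo "$msg"

rm -f "$tmp"
```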
Multiple files
If you pass multiple files to the wc command, the count values will be displayed separately for each file. You'll also get a summary at the end, which sums the respective count of all the input files.
$ wc greeting.txt nums.txt purchases.txt
 2  6 25 greeting.txt
 3  3 13 nums.txt
 8  9 57 purchases.txt
13 18 95 total
$ wc greeting.txt nums.txt purchases.txt | tail -n1
13 18 95 total
$ wc *[ck]*.csv
  9   9 101 marks.csv
  4   4  70 scores.csv
 13  13 171 total
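If only the combined count matters, another approach is to concatenate the files and pass the result to wc via stdin, so that neither the per-file rows nor the word total appear. A sketch using two temporary files (the names one.txt and two.txt are placeholders, not files from this section):

```shell
dir=$(mktemp -d)
printf 'Hi there\nHave a nice day\n' > "$dir/one.txt"
printf '1\n2\n3\n' > "$dir/two.txt"

# cat merges the inputs, so wc sees a single stream
total=$(cat "$dir/one.txt" "$dir/two.txt" | wc -l)
echo "$total"

rm -r "$dir"
```

Note that if a file doesn't end with a newline, its last word gets joined with the first word of the next file, which can change the word count.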
If you have NUL separated filenames (for example, output from find -print0, grep -lZ, etc.), you can use the --files0-from option. This option accepts a file containing the NUL separated data (use - for stdin).
$ printf 'greeting.txt\0nums.txt' | wc --files0-from=-
2 6 25 greeting.txt
3 3 13 nums.txt
5 9 38 total
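As a sketch of how this fits together with find, the following counts lines across all .txt files in a temporary directory. It assumes GNU find and GNU wc; the directory layout and file names are invented for the example.

```shell
dir=$(mktemp -d)
printf '1\n2\n' > "$dir/a.txt"
printf '3\n' > "$dir/b.txt"
printf 'ignored\n' > "$dir/c.log"

# -print0 separates the filenames with NUL, which --files0-from expects
report=$(find "$dir" -name '*.txt' -print0 | wc -l --files0-from=-)
echo "$report"

# the first field of the last row is the combined line count
total=$(echo "$report" | tail -n1 | awk '{print $1}')
echo "$total"

rm -r "$dir"
```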
Character count
Use the -m option instead of -c if the input has multibyte characters.
# byte count
$ printf 'αλεπού' | wc -c
12
# character count
$ printf 'αλεπού' | wc -m
6
Note that the current locale will affect the behavior of the -m option.
$ printf 'αλεπού' | LC_ALL=C wc -m
12
Longest line length
You can use the -L option to report the length of the longest line in the input (excluding the newline character of a line).
$ echo 'apple' | wc -L
5
# last line not ending with newline won't be a problem
$ printf 'apple\nbanana' | wc -L
6
$ wc -L sample.txt
26 sample.txt
$ wc -L <sample.txt
26
If multiple files are passed, the last line summary will show the maximum length among the given inputs.
$ wc -L greeting.txt nums.txt purchases.txt
15 greeting.txt
4 nums.txt
14 purchases.txt
15 total
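One practical use of -L is checking whether any line exceeds a width limit. A minimal sketch, with a hypothetical limit of 20 columns and sample text piped via stdin (-L is an extension, so this assumes GNU wc):

```shell
limit=20
max=$(printf 'short\na considerably longer line\n' | wc -L)

if [ "$max" -gt "$limit" ]; then
    verdict="longest line ($max) exceeds $limit columns"
else
    verdict="all lines fit within $limit columns"
fi
echo "$verdict"
```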
Corner cases
Line count is based on the number of newline characters. So, if the last line of the input doesn't end with the newline character, it won't be counted.
$ printf 'good\nmorning\n' | wc -l
2
$ printf 'good\nmorning' | wc -l
1
$ printf '\n\n\n' | wc -l
3
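If the final line should be counted even when it lacks a trailing newline, a record-based tool can be used instead. The sketch below swaps in awk and its NR variable as one such alternative; this isn't a wc feature.

```shell
# wc counts newline characters: only one here
with_wc=$(printf 'good\nmorning' | wc -l)

# awk counts records, so the unterminated last line is included
with_awk=$(printf 'good\nmorning' | awk 'END{print NR}')

echo "wc: $with_wc, awk: $with_awk"
```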
Word count is based on whitespace separation. You'll have to pre-process the input if you do not want certain non-whitespace characters to influence the results.
$ echo 'apple ; banana ; cherry' | wc -w
5
# remove characters other than letters and whitespace
$ echo 'apple ; banana ; cherry' | tr -cd 'a-zA-Z[:space:]'
apple banana cherry
$ echo 'apple ; banana ; cherry' | tr -cd 'a-zA-Z[:space:]' | wc -w
3
# allow numbers as well
$ echo '2 : apples ;' | tr -cd '[:alnum:][:space:]' | wc -w
2
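Deleting characters works in the examples above because the words are already separated by spaces. When the punctuation itself is the only separator, deleting it would merge the words, so replacing it with a space is the safer choice. A sketch comparing the two approaches (the variable names are illustrative):

```shell
input='apple;banana;cherry'

# deleting ';' glues everything into a single word
deleted=$(echo "$input" | tr -cd 'a-zA-Z[:space:]' | wc -w)

# translating every non-alphanumeric character to a space keeps words apart
replaced=$(echo "$input" | tr -c '[:alnum:]' ' ' | wc -w)

echo "deleted: $deleted, replaced: $replaced"
```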
The -L option won't count non-printable characters, and tabs are converted to the equivalent number of spaces. Multibyte characters will each be counted as 1 (depending on the locale, they might be treated as non-printable too).
# tab characters can occupy up to 8 columns
$ printf '\t' | wc -L
8
$ printf 'a\tb' | wc -L
9
# example for non-printable character
$ printf 'a\34b' | wc -L
2
# multibyte characters are counted as 1 each in supported locales
$ printf 'αλεπού' | wc -L
6
# non-supported locales can cause them to be treated as non-printable
$ printf 'αλεπού' | LC_ALL=C wc -L
0
The -m and -L options count grapheme clusters differently.
$ printf 'cag̈e' | wc -m
5
$ printf 'cag̈e' | wc -L
4
Exercises
The exercises directory has all the files used in this section.
1) Save the number of lines in the greeting.txt input file to the lines shell variable.
$ lines=##### add your solution here
$ echo "$lines"
2
2) What do you think will be the output of the following command?
$ echo 'dragons:2 ; unicorns:10' | wc -w
3) Use appropriate options and arguments to get the output as shown below. Also, why is the line count showing as 2 instead of 3 for the stdin data?
$ printf 'apple\nbanana\ncherry' | ##### add your solution here
2 25 greeting.txt
2 19 -
4 44 total
4) Use appropriate options and arguments to get the output shown below.
$ printf 'greeting.txt\0scores.csv' | ##### add your solution here
2 6 25 greeting.txt
4 4 70 scores.csv
6 10 95 total
5) What is the difference between the wc -c and wc -m options? And which option would you use to get the longest line length?
6) Calculate the number of comma separated words from the scores.csv file.
$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
##### add your solution here
16