File Properties

In this chapter, you'll learn how to view file details like line and word counts, file and disk sizes, file types, parts of a file path, etc. You'll also learn how to change file properties like timestamps and permissions.

info The example_files directory has the scripts and sample input files used in this chapter.


The wc command is typically used to count the number of lines, words and characters for the given input(s). Here are some basic examples:

# change to the 'example_files/text_files' directory
$ cat greeting.txt
Hi there
Have a nice day

# by default, gives newline/word/byte count (in that order)
$ wc greeting.txt
 2  6 25 greeting.txt

# get only the specified counts
$ wc -l greeting.txt
2 greeting.txt
$ wc -w greeting.txt
6 greeting.txt
$ wc -c greeting.txt
25 greeting.txt
$ wc -wc greeting.txt
 6 25 greeting.txt

Filename won't be printed for stdin data. This is helpful to save the results in a variable for scripting purposes.

$ wc -l <greeting.txt
2
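
Here's a minimal sketch of the scripting use mentioned above. It assumes a throwaway file created with mktemp so that it can be run anywhere; the variable names tmpfile and line_count are only for illustration.

```shell
# create a temporary two-line file so the example is self-contained
tmpfile=$(mktemp)
printf 'Hi there\nHave a nice day\n' > "$tmpfile"

# reading from stdin keeps the filename out of the output,
# so the command substitution captures only the count
line_count=$(wc -l < "$tmpfile")
echo "$line_count"

rm -f "$tmpfile"
```

If the file were passed as an argument instead, the variable would also contain the filename and you'd need extra processing to extract the number.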

Word count is based on whitespace separation. You can pre-process the input to prevent certain non-whitespace characters from influencing the results. tr can be used to remove a particular set of characters (this command will be discussed in the Assorted Text Processing Tools chapter).

$ echo 'apple ; banana ; cherry' | wc -w
5

# remove characters other than alphabets and whitespace
# -d option is for deleting, -c option complements the given set
$ echo 'apple ; banana ; cherry' | tr -cd 'a-zA-Z[:space:]'
apple  banana  cherry
$ echo 'apple ; banana ; cherry' | tr -cd 'a-zA-Z[:space:]' | wc -w
3

If you pass multiple files to the wc command, the count values will be displayed separately for each file. You'll also get a summary at the end, which sums the respective count of all the input files.

$ wc greeting.txt fruits.txt sample.txt
  2   6  25 greeting.txt
  3   3  20 fruits.txt
 15  38 183 sample.txt
 20  47 228 total

You can use the -L option to report the length of the longest line in the input (excluding the newline character of a line). Note that -L won't count non-printable characters and tabs are converted to equivalent spaces. Multibyte characters and grapheme clusters will each be counted as 1 (depending on the locale, they might become non-printable too).

$ echo 'apple' | wc -L
5

$ echo 'αλεπού cag̈e' | wc -L
11

$ wc -L <greeting.txt
15

Use the -m option instead of -c if the input has multibyte characters.

$ printf 'αλεπού' | wc -c
12

$ printf 'αλεπού' | wc -m
6
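
The character count reported by -m depends on the active locale. The sketch below assumes GNU wc and a UTF-8 encoded script. It uses a hypothetical three-letter string to show that -m falls back to counting bytes when the C locale is forced.

```shell
# three Greek letters, each encoded as two bytes in UTF-8
text='αβγ'

# -c always counts bytes
bytes=$(printf '%s' "$text" | wc -c)

# in the C locale every byte is treated as a character,
# so -m reports the same value as -c
c_chars=$(printf '%s' "$text" | LC_ALL=C wc -m)

echo "bytes: $bytes, characters in the C locale: $c_chars"
```

In a UTF-8 locale, the same -m command would report 3 instead.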


The du command helps you estimate the size of files and directories.

By default, size is given in units of 1024 bytes. All directories and sub-directories are recursively reported, but files are ignored. You can use the -a option if files should also be reported. du is one of the commands that require an explicit option (-L in this case) if you want symbolic links to be followed.

# change to the 'scripts' directory and source the '' script
$ source

# n * 1024 bytes
$ du
28      ./projects/scripts
48      ./projects
8       ./todos
7536    .
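
To see the effect of the -a option without relying on the sample directory, here's a small sketch that builds a throwaway tree (the names notes and todo.txt are made up for illustration) and compares the number of entries reported with and without -a.

```shell
# build a small throwaway tree: one sub-directory containing one file
top=$(mktemp -d)
mkdir "$top/notes"
printf 'hello\n' > "$top/notes/todo.txt"

# directories only: 'notes' and the top-level directory itself
dirs_only=$(du "$top" | wc -l)

# -a adds an entry for the file as well
with_files=$(du -a "$top" | wc -l)

echo "du: $dirs_only entries, du -a: $with_files entries"

rm -rf "$top"
```

The sizes themselves aren't checked here, since they vary with the file system in use.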

Use the -s option to show the total directory size without descending into sub-directories. Add the -c option to also show the total size at the end.

$ du -s projects report.log
48      projects
7476    report.log

$ du -sc projects report.log
48      projects
7476    report.log
7524    total

Here are some examples to illustrate size formatting options:

# number of bytes
$ du -b report.log
7654321 report.log

# n * 1024 bytes
$ du -k report.log
7476    report.log

# n * 1024 * 1024 bytes
$ du -m report.log
8       report.log

The -h option reports size in human readable format (uses powers of 1024). Use the --si option to get results in powers of 1000 instead. If you use du -h, you can pipe the output to sort -h for sorting purposes.

$ du -sh *
48K     projects
7.4M    report.log
8.0K    todos

$ du -sh * | sort -h
8.0K    todos
48K     projects
7.4M    report.log


The df command gives you the space usage of file systems. df without path arguments will give information about all the currently mounted file systems. You can specify . to get information only for the current filesystem:

$ df .
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda1       98298500 58563816  34734748  63% /

Use -h option for human readable sizes. The -B option allows you to scale sizes by the specified amount. Use --si for size in powers of 1000 instead of 1024.

$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        94G   56G   34G  63% /
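
The -B and --si options aren't shown above, so here's a rough sketch assuming GNU df. The actual numbers depend on your system, so no output is shown. The rows variable is only there to confirm that a header and one data row are printed for the current file system.

```shell
# -BM scales sizes to units of 1 MiB (1024 * 1024 bytes)
df -BM .

# --si uses powers of 1000, so the numbers differ slightly from -h
df --si .

# header plus a single data row for the current file system
rows=$(df -BM --output=size,used,avail . | wc -l)
echo "$rows"
```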

Use the --output option to report only specific fields of interest:

$ df -h --output=size,used,file / /media/learnbyexample/projs
 Size  Used File
  94G   56G /
  92G   35G /media/learnbyexample/projs

# 'awk' here excludes the first line and matches lines with first field >= 30
# adding 0 forces a numeric comparison, values like '63%' are otherwise compared as strings
$ df -h --output=pcent,fstype,target | awk 'NR>1 && $1+0>=30'
 63% ext3     /
 38% ext4     /media/learnbyexample/projs
 51% ext4     /media/learnbyexample/backups


The stat command is useful to get details like file type, size, inode, permissions, last accessed and modified timestamps, etc. You'll get all of these details by default. The -c and --printf options can be used to display only the required details in a particular format.

# change to the 'scripts' directory and source the '' script
$ source

# %x gives last accessed timestamp
$ stat -c '%x' ip.txt
2022-06-01 13:25:18.693823117 +0530

# %y gives last modified timestamp
$ stat -c '%y' ip.txt
2022-05-24 14:39:41.285714934 +0530

# %s gives file size in bytes
# \n is used to get a newline
# %i gives the inode value
# same as: stat --printf='%s\n%i\n' ip.txt
$ stat -c $'%s\n%i' ip.txt

# %N gives quoted filenames
# if input is a link, path it points to is also displayed
$ stat -c '%N' words.txt
'words.txt' -> '/usr/share/dict/words'

You can also pass multiple file arguments:

# %s gives file size in bytes
# %n gives filenames
$ stat -c '%s %n' ip.txt
10 ip.txt
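
The difference between -c and --printf is easier to see side by side. This sketch assumes GNU stat and uses a temporary 10-byte file (created here only for illustration) instead of the sample files.

```shell
# throwaway file holding exactly 10 bytes
f=$(mktemp)
printf '0123456789' > "$f"

# -c appends a newline after the formatted output
size_c=$(stat -c '%s' "$f")

# --printf interprets backslash escapes and adds no trailing newline,
# so \n has to be written explicitly between fields
# %F gives the file type
size_and_type=$(stat --printf='%s\n%F\n' "$f")

echo "$size_c"
echo "$size_and_type"

rm -f "$f"
```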

info warning The stat command should be preferred instead of parsing ls -l output for file details. See mywiki.wooledge: avoid parsing output of ls and unix.stackexchange: why not parse ls? for explanation and other alternatives.


As mentioned earlier, the touch command helps you change the timestamps of files. You can do so based on current timestamp, passing an argument, copying the value from another file and so on.

By default, touch updates both access and modification timestamps to the current time. You can use -a to change only access timestamp and -m to change only modification timestamp.

# change to the 'scripts' directory and source the '' script
$ source

# last access and modification timestamps
$ stat -c $'%x\n%y' fruits.txt
2017-07-19 17:06:01.523308599 +0530
2017-07-13 13:54:03.576055933 +0530

# update access and modification values to the current time
$ touch fruits.txt 
$ stat -c $'%x\n%y' fruits.txt
2022-06-14 13:01:25.921205889 +0530
2022-06-14 13:01:25.921205889 +0530

You can use the -r option to copy timestamp information from one file to another. The -d and -t options will allow you to specify timestamps directly as part of the command.

$ stat -c '%y'
2022-06-14 13:00:46.170416890 +0530

# copy modified timestamp from 'ip.txt' to ''
$ touch -m -r ip.txt
$ stat -c '%y'
2022-05-24 14:39:41.285714934 +0530

# pass timestamp as an argument
$ touch -m -d '2000-01-01 00:00:01'
$ stat -c '%y'
2000-01-01 00:00:01.000000000 +0530
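
To confirm that -m leaves the access timestamp alone, here's a sketch assuming GNU touch and stat. The timestamps are given in UTC and compared as seconds since the epoch so that the result doesn't depend on your timezone. The file is a temporary one created for this illustration.

```shell
f=$(mktemp)

# set both timestamps to a known value
touch -d '2000-01-01 00:00:01 UTC' "$f"

# change only the modification timestamp
touch -m -d '2010-01-01 00:00:00 UTC' "$f"

# %X and %Y give access and modification times as seconds since the epoch
atime=$(stat -c '%X' "$f")
mtime=$(stat -c '%Y' "$f")
echo "access: $atime, modification: $mtime"

rm -f "$f"
```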

As seen in the Managing Files and Directories chapter, touch creates a new file if the target file doesn't exist yet. You can use the -c option to prevent this behavior.

$ ls report.txt
ls: cannot access 'report.txt': No such file or directory
$ touch report.txt
$ ls report.txt
report.txt

$ touch -c xyz.txt
$ ls xyz.txt
ls: cannot access 'xyz.txt': No such file or directory


The file command helps you identify the type of a file: the text encoding (ASCII, UTF-8, etc.), whether the file is executable and so on.

Here are some examples to show how the file command behaves for different types:

# change to the 'scripts' directory and source the '' script
$ source
$ ls -F
*  ip.txt  moon.png  sunrise.jpg

$ file ip.txt
ip.txt: ASCII text
:  Bourne-Again shell script, ASCII text executable

$ printf 'αλεπού\n' | file -
/dev/stdin: UTF-8 Unicode text

$ printf 'hi\r\n' | file -
/dev/stdin: ASCII text, with CRLF line terminators

Example for image files:

# output of 'sunrise.jpg' wrapped for illustration purposes
$ file sunrise.jpg moon.png
sunrise.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), density
    96x96, segment length 16, baseline, precision 8, 76x76, components 3
moon.png:    PNG image data, 76 x 76, 8-bit colormap, non-interlaced

You can use the -b option to avoid filenames in the output:

$ file -b ip.txt 
ASCII text

Here is an example of finding files of a particular type, say image files.

# assuming filenames do not contain ':' or newline characters
# awk here helps to print the first field of lines containing 'image data'
$ find -type f -exec file {} + | awk -F: '/\<image data\>/{print $1}'

info See also identify command which "describes the format and characteristics of one or more image files".


By default, the basename command will remove the leading directory components from the given path argument. Any trailing slashes will be removed before determining the portion to be extracted.

$ basename /home/learnbyexample/example_files/scores.csv
scores.csv

# quote the arguments as needed
$ basename 'path with spaces/report.log'
report.log

You can use the -s option to remove a suffix from the filename. This is usually used to remove the file extension.

$ basename -s'.csv' /home/learnbyexample/example_files/scores.csv
scores

# suffix will be removed only once
$ basename -s'.txt' purchases.txt.txt
purchases.txt

The basename command requires -a or -s (which implies -a) to work with multiple arguments.

$ basename -a /backups/jan_2021.tar.gz /home/learnbyexample/report.log
jan_2021.tar.gz
report.log

# -a is implied when -s is used
$ basename -s'.txt' logs/purchases.txt logs/report.txt
purchases
report


By default, the dirname command removes the trailing path component (after removing any trailing slashes).

$ dirname /home/learnbyexample/example_files/scores.csv
/home/learnbyexample/example_files

# one or more trailing slashes will not affect the output
$ dirname /home/learnbyexample/example_files/
/home/learnbyexample

# unlike basename, multiple arguments are accepted by default
$ dirname /home/learnbyexample/example_files/scores.csv ../report/backups/
/home/learnbyexample/example_files
../report

You can use shell features like command substitution to combine the effects of basename and dirname commands.

# extract the second last path component
$ basename $(dirname /home/learnbyexample/example_files/scores.csv)
example_files
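
In scripts, you'll often want several parts of the same path. Here's a sketch of one way to do that. The path is hypothetical and includes a space to show why quoting the variable and the nested command substitution matters.

```shell
# hypothetical path containing a space
path='/home/learnbyexample/my reports/scores.csv'

dir=$(dirname "$path")                    # /home/learnbyexample/my reports
file=$(basename "$path")                  # scores.csv
stem=$(basename -s '.csv' "$path")        # scores
parent=$(basename "$(dirname "$path")")   # my reports

printf '%s\n' "$dir" "$file" "$stem" "$parent"
```

Neither command checks whether the path exists. They only operate on the string given as the argument.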


You can use the chmod command to change file and directory permissions. Consider this example:

$ mkdir practice_chmod
$ cd practice_chmod
$ echo 'learnbyexample' > ip.txt

# this info can also be seen in the first column of 'ls -l' output
$ stat -c '%A' ip.txt
-rw-rw-r--

In the above output, the 10 characters displayed in the last line are related to file type and permissions. First character indicates the file type. The most common ones are shown below:

  • - regular file
  • d directory
  • l symbolic link

The other nine characters represent three sets of file permissions for user (u), group (g) and others (o), in that order.

  • user — file owner
  • group — users having file access as part of a group
  • others — everyone else

Only rwx file properties will be discussed in this section. For other types of properties, refer to the coreutils manual: File permissions.

Permission reference table for files:

Character  Meaning        Value
r          read           4
w          write          2
x          execute        1
-          no permission  0

Here's an example showing both rwx and numerical representations of a file's permissions:

$ stat -c '%A' ip.txt
-rw-rw-r--

# r(4) + w(2) + 0 = 6
# r(4) + 0 + 0 = 4
$ stat -c '%a' ip.txt
664
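
The arithmetic in the comments above can be sketched as a small shell function. triplet_to_digit is a hypothetical helper written for this illustration and isn't part of any standard tool. It converts one set of three rwx characters into the corresponding digit.

```shell
# convert one rwx triplet such as 'r-x' into its numeric value
triplet_to_digit() {
    n=0
    case $1 in r??) n=$((n + 4)) ;; esac   # read
    case $1 in ?w?) n=$((n + 2)) ;; esac   # write
    case $1 in ??x) n=$((n + 1)) ;; esac   # execute
    echo "$n"
}

# user 'rw-', group 'r--', others '---'
mode="$(triplet_to_digit rw-)$(triplet_to_digit r--)$(triplet_to_digit ---)"
echo "$mode"
```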

info Note that the permissions are not straightforward to understand for directories. If a directory only has the x permission, you can cd into it but you cannot read the contents (using ls for example). If a directory only has the r permission, you cannot cd into it, but you'll be able to read the contents (along with "cannot access" error). For this reason, rx permissions are almost always enabled/disabled together. The w permission allows you to add or remove contents, provided x is active.

Changing permissions for all three categories

You can provide numbers for ugo (in that order) to change permissions. This is best understood with examples:

$ printf '#!/bin/bash\n\necho hi\n' >
$ stat -c '%a %A'
664 -rw-rw-r--

# r(4) + w(2) + x(1) = 7
# r(4) + 0 + x(1) = 5
$ chmod 755
$ stat -c '%a %A'
755 -rwxr-xr-x

Here's an example for a directory:

$ mkdir dot_files
$ stat -c '%a %A' dot_files
775 drwxrwxr-x

$ chmod 700 dot_files
$ stat -c '%a %A' dot_files
700 drwx------

You can also use mkdir -m instead of the mkdir+chmod combination seen above. The argument to the -m option accepts the same syntax as chmod (including the format that'll be discussed next).

$ mkdir -m 750 backups
$ stat -c '%a %A' backups
750 drwxr-x---
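
The same result can be sketched with the symbolic format accepted by mkdir -m. This example assumes GNU mkdir and stat, and works inside a temporary directory so that nothing is left behind.

```shell
top=$(mktemp -d)

# symbolic form of 750: everything for user, read/execute for group, nothing for others
mkdir -m u=rwx,g=rx,o= "$top/backups"
perms=$(stat -c '%a %A' "$top/backups")
echo "$perms"

rm -rf "$top"
```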

info You can use chmod -R to recursively change permissions. Use find+exec if you want to apply changes only for files filtered by some criteria.
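
As a sketch of the find+exec approach, the example below applies one mode to directories and another to regular files, which chmod -R with a numeric mode can't do by itself. The directory layout and the chosen modes are made up for illustration.

```shell
top=$(mktemp -d)
mkdir -p "$top/projects/scripts"
touch "$top/projects/report.log" "$top/projects/scripts/notes.txt"

# directories need x to be entered, regular files usually don't,
# so apply different modes based on the file type
find "$top/projects" -type d -exec chmod 755 {} +
find "$top/projects" -type f -exec chmod 644 {} +

dir_mode=$(stat -c '%a' "$top/projects/scripts")
file_mode=$(stat -c '%a' "$top/projects/scripts/notes.txt")
echo "directory: $dir_mode, file: $file_mode"

rm -rf "$top"
```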

Changing permissions for specific categories

You can assign (=), add (+) or remove (-) permissions by using those symbols followed by one or more rwx permissions. When no ugo prefix is given, the categories that get affected depend on the umask value:

$ umask
0002

umask value of 0002 means:

  • read and execute permissions without ugo prefix affect all the three categories
  • write permissions without ugo prefix affect only user and group categories

Here are some examples without ugo prefixes:

# remove execute permission for all three categories
$ chmod -x

# add write permission only for 'user' and 'group'
$ chmod +w ip.txt

$ touch sample.txt
$ chmod 702 sample.txt
# give only read permission for all three categories
# write/execute permissions, if any, will be removed
$ chmod =r sample.txt
$ stat -c '%a %A' sample.txt
444 -r--r--r--

# give read and write permissions for 'user' and 'group'
# and read permission for 'others'
# execute permissions, if any, will be removed
$ chmod =rw
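
To see how the umask value changes the meaning of a prefix-less permission, here's a sketch assuming GNU chmod and stat. The umask is changed inside a command substitution (a subshell), so your current shell's setting is left untouched.

```shell
f=$(mktemp)

# start from read-only for everyone, then add write without a ugo prefix
# umask 022 masks write for group and others, so only user gets it
mode_022=$(umask 022; chmod 444 "$f"; chmod +w "$f"; stat -c '%a' "$f")

# umask 002 masks write only for others, so user and group get it
mode_002=$(umask 002; chmod 444 "$f"; chmod +w "$f"; stat -c '%a' "$f")

echo "umask 022: $mode_022, umask 002: $mode_002"

rm -f "$f"
```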

Here are some examples with ugo prefixes. You can use a to refer to all the three categories. For example, a+w is the same as ugo+w.

# remove read and write permissions only for 'others'
$ chmod o-rw sample.txt

# add execute permission for 'group' and 'others'
$ chmod go+x

# give read and write permissions for all three categories
# execute permissions, if any, will be removed
$ chmod a=rw

You can use , to separate multiple permissions:

# remove execute permission for 'group' and 'others'
# remove write permission for 'others'
$ chmod go-x,o-w
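
Here's a sketch that traces a comma-separated change step by step, using a temporary file that starts with all permissions enabled (an assumption made only so that every change is visible).

```shell
f=$(mktemp)
chmod 777 "$f"

# go-x turns 777 into 766, then o-w turns it into 764
chmod go-x,o-w "$f"
perms=$(stat -c '%a %A' "$f")
echo "$perms"

rm -f "$f"
```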

Further Reading


info Use example_files/text_files directory for input files used in the following exercises, unless otherwise specified.

info Create a temporary directory for exercises that may require you to create some files and directories. You can delete such practice directories afterwards.

1) Save the number of lines in the greeting.txt input file to the lines shell variable.

# ???
$ echo "$lines"

2) What do you think will be the output of the following command?

$ echo 'dragons:2 ; unicorns:10' | wc -w

3) Use appropriate options and arguments to get the output shown below.

$ printf 'apple\nbanana\ncherry' | wc # ???
     15     183 sample.txt
      2      19 -
     17     202 total

4) Go through the wc manual and use appropriate options and arguments to get the output shown below.

$ printf 'greeting.txt\0scores.csv' | wc # ???
2 6 25 greeting.txt
4 4 70 scores.csv
6 10 95 total

5) What is the difference between wc -c and wc -m options? And which option would you use to get the longest line length?

6) Find filenames ending with .log and report their sizes in human readable format. Use find+du combination for the first case and ls command (with appropriate shell features) for the second case.

# change to the 'scripts' directory and source the '' script
$ source

# ??? find+du
16K     ./projects/errors.log
7.4M    ./report.log

# ??? ls and shell features
 16K projects/errors.log
7.4M report.log

7) Report sizes of files/directories in the current path in powers of 1000 without descending into sub-directories. Also, show a total at the end.

# change to the 'scripts' directory and source the '' script
$ source

# ???
50k     projects
7.7M    report.log
8.2k    todos
7.8M    total

8) What does the du --apparent-size option do?

9) When will you use the df command instead of du? Which df command option will help you to report only specific fields of interest?

10) Display the size of scores.csv and timings.txt files in the format shown below.

$ stat # ???
scores.csv: 70
timings.txt: 49

11) Which touch option will help you prevent file creation if it doesn't exist yet?

12) Assume new_file.txt doesn't exist in the current working directory. What would be the output of the stat command shown below?

$ touch -t '202010052010.05' new_file.txt
$ stat -c '%y' new_file.txt
# ???

13) Is the following touch command valid? If so, what would be the output of the stat command that follows?

# change to the 'scripts' directory and source the '' script
$ source

$ stat -c '%n: %y' fruits.txt
fruits.txt: 2017-07-13 13:54:03.576055933 +0530

$ touch -r fruits.txt f{1..3}.txt
$ stat -c '%n: %y' f*.txt
# ???

14) Use appropriate option(s) to get the output shown below.

$ printf 'αλεπού\n' | file -
/dev/stdin: UTF-8 Unicode text

$ printf 'αλεπού\n' | file # ???
UTF-8 Unicode text

15) Is the following command valid? If so, what would be the output?

$ basename -s.txt ~///test.txt///
# ???

16) Given the file path in the shell variable p, how'd you obtain the output shown below?

$ p='~/projects/square_tictactoe/python/'
$ dirname # ???

17) Explain what each of the characters means in the following stat command's output.

$ stat -c '%A' ../scripts/

18) What would be the output of the second stat command shown below?

$ touch new_file.txt
$ stat -c '%a %A' new_file.txt
664 -rw-rw-r--

$ chmod 546 new_file.txt
$ stat -c '%a %A' new_file.txt
# ???

19) How would you specify directory permissions using the mkdir command?

# instead of this
$ mkdir back_up
$ chmod 750 back_up
$ stat -c '%a %A' back_up
750 drwxr-x---
$ rm -r back_up

# do this
$ mkdir # ???
$ stat -c '%a %A' back_up
750 drwxr-x---

20) Change the file permission of book_list.txt to match the output of the second stat command shown below. Don't use the number 220, specify the changes in terms of rwx characters.

$ touch book_list.txt
$ stat -c '%a %A' book_list.txt
664 -rw-rw-r--

# ???
$ stat -c '%a %A' book_list.txt
220 --w--w----

21) Change the permissions of test_dir to match the output of the second stat command shown below. Don't use the number 757, specify the changes in terms of rwx characters.

$ mkdir test_dir
$ stat -c '%a %A' test_dir
775 drwxrwxr-x

# ???
$ stat -c '%a %A' test_dir
757 drwxr-xrwx