Introduction

Quoting from Wikipedia:

grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command g/re/p (global / regular expression search / and print), which has the same effect.

Use of grep has become so ubiquitous that it has found its way into the Oxford dictionary as well. As part of everyday computer usage, the need to search comes up often. It could be finding the right emoji by name on social media, searching your browser bookmarks, locating a particular function in a programming file and so on. Some of these tools have options for refining a search further, like controlling case sensitivity, restricting matches to whole words, using regular expressions, etc.

grep provides all of the above features and much more when it comes to searching and extracting content from text files. After getting used to grep, the search features provided by GUI programs feel slower and inadequate.

Installation

If you are on a Unix-like system, you will most likely have some version of grep already installed. This book is primarily about GNU grep and also has a chapter on ripgrep. As there are syntax and feature differences between various implementations, make sure you have these particular implementations installed so that you can follow along with the examples presented in this book.

GNU grep is part of the text creation and manipulation tools and comes by default on GNU/Linux distributions. To install a particular version, visit gnu: grep software. See also the release notes for an overview of changes between versions, and the bug list if you think some command isn't working as expected.

Sample instructions for compiling the latest version are shown below. You might need to install a PCRE library first, for example sudo apt install libpcre2-dev.

$ wget https://ftp.gnu.org/gnu/grep/grep-3.10.tar.xz
$ tar -xf grep-3.10.tar.xz
$ cd grep-3.10/
# see https://askubuntu.com/q/237576 if you get compiler not found error
$ ./configure
$ make
$ sudo make install

$ grep -V | head -n1
grep (GNU grep) 3.10

If you are not using a Linux distribution, you may be able to access GNU grep using one of the options below:

  • Git for Windows — provides a Bash emulation used to run Git from the command line
  • Windows Subsystem for Linux — compatibility layer for running Linux binary executables natively on Windows
  • brew — Package Manager for macOS (or Linux)
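Whichever route you take, it helps to confirm that the command you end up running really is GNU grep. The snippet below is a minimal sketch: is_gnu_grep is a hypothetical helper name (not part of grep), and it assumes that the first line of the --version output contains the text (GNU grep), as in the sample output shown earlier.

```shell
# Hypothetical helper: succeeds if the given command reports itself as GNU grep.
# Assumes the version banner looks like: grep (GNU grep) 3.10
is_gnu_grep() {
    cmd="${1:-grep}"
    "$cmd" --version 2>/dev/null | head -n1 | grep -qF '(GNU grep)'
}

# Check the default grep found in PATH
if is_gnu_grep grep; then
    echo 'grep is GNU grep'
else
    echo 'grep is not GNU grep (or is not installed)'
fi
```

Pass a different command name as the argument if GNU grep is installed under another name on your system (some package managers add a prefix to avoid clashing with the system grep).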

Options overview

It is always good to know where to find documentation. From the command line, you can use man grep for a short manual and info grep for the full documentation. I prefer using the online GNU grep manual, which feels much easier to use and navigate.

$ man grep
NAME
       grep - print lines that match patterns

SYNOPSIS
       grep [OPTION...] PATTERNS [FILE...]
       grep [OPTION...] -e PATTERNS ... [FILE...]
       grep [OPTION...] -f PATTERN_FILE ... [FILE...]

DESCRIPTION
       grep searches for PATTERNS in each FILE.  PATTERNS is one or more
       patterns separated by newline characters, and  grep  prints  each
       line that matches a pattern.  Typically PATTERNS should be quoted
       when grep is used in a shell command.

       A FILE of “-” stands for standard input.  If no  FILE  is  given,
       recursive   searches   examine   the   working   directory,   and
       nonrecursive searches read standard input.
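The three synopsis forms and the standard input behavior described above can be tried out with a short sketch like the one below. The file contents and the variable names fruits and pats are made up for illustration.

```shell
# Temporary sample files (contents are illustrative only)
fruits=$(mktemp)
pats=$(mktemp)
printf 'apple\nbanana\ncherry\n' > "$fruits"
printf 'ban\nche\n' > "$pats"

# Form 1: PATTERNS as the first non-option argument
grep 'an' "$fruits"                  # banana

# Form 2: one or more patterns, each given with -e
grep -e 'app' -e 'rry' "$fruits"     # apple, cherry

# Form 3: patterns read from a file, one per line
grep -f "$pats" "$fruits"            # banana, cherry

# No FILE given: a nonrecursive search reads standard input
printf 'one\ntwo\n' | grep 'tw'      # two
```

Quoting the pattern, as recommended in the manual, prevents the shell from interpreting characters that are special to it.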

For a quick overview of all the available options, use grep --help from the command line. These are shown below in table format:

Regexp selection:

Option                        Description
-E, --extended-regexp         PATTERNS are extended regular expressions
-F, --fixed-strings           PATTERNS are strings
-G, --basic-regexp            PATTERNS are basic regular expressions
-P, --perl-regexp             PATTERNS are Perl regular expressions
-e, --regexp=PATTERNS         use PATTERNS for matching
-f, --file=FILE               take PATTERNS from FILE
-i, --ignore-case             ignore case distinctions in patterns and data
--no-ignore-case              do not ignore case distinctions (default)
-w, --word-regexp             match only whole words
-x, --line-regexp             match only whole lines
-z, --null-data               a data line ends in 0 byte, not newline
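As a quick illustration of a few of these options, here is a small sketch using a made-up sample file (the variable name sample and the file contents are assumptions for this example only). The expected matches are noted in the comments.

```shell
# Temporary sample file (contents are illustrative only)
sample=$(mktemp)
printf 'Cat\ncat\nconcat\ncat nap\nc.t\n' > "$sample"

# -i ignores case: Cat, cat, concat, cat nap
grep -i 'cat' "$sample"

# -w matches whole words only: cat, cat nap
grep -w 'cat' "$sample"

# -x requires the whole line to match: cat
grep -x 'cat' "$sample"

# -F treats the pattern as a literal string, so '.' is not a metacharacter: c.t
grep -F 'c.t' "$sample"

# -E enables extended regular expressions such as alternation: Cat, cat nap
grep -E 'Cat|nap' "$sample"
```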

Miscellaneous:

Option                        Description
-s, --no-messages             suppress error messages
-v, --invert-match            select non-matching lines
-V, --version                 display version information and exit
--help                        display this help text and exit
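Of these, -v is the one you are likely to use most often. A minimal sketch, with a made-up sample file (the variable name colors is an assumption for this example):

```shell
# Temporary sample file (contents are illustrative only)
colors=$(mktemp)
printf 'red\ngreen\nblue\n' > "$colors"

# -v selects the lines that do NOT match the pattern
grep -v 'r' "$colors"       # blue
grep -v 'blue' "$colors"    # red, green
```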

Output control:

Option                        Description
-m, --max-count=NUM           stop after NUM selected lines
-b, --byte-offset             print the byte offset with output lines
-n, --line-number             print line number with output lines
--line-buffered               flush output on every line
-H, --with-filename           print file name with output lines
-h, --no-filename             suppress the file name prefix on output
--label=LABEL                 use LABEL as the standard input file name prefix
-o, --only-matching           show only nonempty parts of lines that match
-q, --quiet, --silent         suppress all normal output
--binary-files=TYPE           assume that binary files are TYPE;
                              TYPE is 'binary', 'text', or 'without-match'
-a, --text                    equivalent to --binary-files=text
-I                            equivalent to --binary-files=without-match
-d, --directories=ACTION      how to handle directories;
                              ACTION is 'read', 'recurse', or 'skip'
-D, --devices=ACTION          how to handle devices, FIFOs and sockets;
                              ACTION is 'read' or 'skip'
-r, --recursive               like --directories=recurse
-R, --dereference-recursive   likewise, but follow all symlinks
--include=GLOB                search only files that match GLOB (a file pattern)
--exclude=GLOB                skip files that match GLOB
--exclude-from=FILE           skip files that match any file pattern from FILE
--exclude-dir=GLOB            skip directories that match GLOB
-L, --files-without-match     print only names of FILEs with no selected lines
-l, --files-with-matches      print only names of FILEs with selected lines
-c, --count                   print only a count of selected lines per FILE
-T, --initial-tab             make tabs line up (if needed)
-Z, --null                    print 0 byte after FILE name
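The sketch below tries out some of the commonly used output control options. The directory, file names and contents are made up for illustration, and the expected output is noted in the comments.

```shell
# Temporary directory with two sample files (contents are illustrative only)
dir=$(mktemp -d)
printf 'good day\nbad day\ngood night\n' > "$dir/a.txt"
printf 'plain text\n' > "$dir/b.txt"

# -n prefixes each output line with its line number: 1:good day, 2:bad day
grep -n 'day' "$dir/a.txt"

# -c prints only the count of selected lines: 2
grep -c 'good' "$dir/a.txt"

# -o prints only the matching portions, one per line: good, good
grep -o 'good' "$dir/a.txt"

# -m1 stops after the first selected line: good day
grep -m1 'good' "$dir/a.txt"

# -l prints only the names of files with a match: path of a.txt
grep -l 'day' "$dir"/*.txt

# -r searches a directory recursively, -h drops the file name prefix: plain text
grep -rh 'text' "$dir"
```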

Context control:

Option                        Description
-B, --before-context=NUM      print NUM lines of leading context
-A, --after-context=NUM       print NUM lines of trailing context
-C, --context=NUM             print NUM lines of output context
-NUM                          same as --context=NUM
--group-separator=SEP         print SEP on line between matches with context
--no-group-separator          do not print separator for matches with context
--color[=WHEN],               use markers to highlight the matching strings;
--colour[=WHEN]               WHEN is 'always', 'never', or 'auto'
-U, --binary                  do not strip CR characters at EOL (MSDOS/Windows)
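To get a feel for the context options, here is a minimal sketch using a made-up sample file (the variable name nums and its contents are assumptions for this example). The expected output is noted in the comments.

```shell
# Temporary sample file (contents are illustrative only)
nums=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n' > "$nums"

# -A1 adds one line after each match: two, three
grep -A1 'two' "$nums"

# -B1 adds one line before each match: one, two
grep -B1 'two' "$nums"

# -C1 adds one line on both sides: three, four, five
grep -C1 'four' "$nums"

# Non-adjacent groups are separated by a line containing --
# Output: one, two, --, six, seven
grep -A1 -e 'one' -e 'six' "$nums"

# The separator can be turned off: one, two, six, seven
grep --no-group-separator -A1 -e 'one' -e 'six' "$nums"
```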