Introduction

Quoting from Wikipedia:

grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command g/re/p (global / regular expression search / and print), which has the same effect.

Use of grep has become so ubiquitous that it has found its way into the Oxford dictionary as well. The need to search comes up often in everyday computer usage: finding the right emoji by name on social media, searching your browser bookmarks, locating a particular function in a source file, and so on. Some of these search features have options for refining the results further, such as controlling case sensitivity, restricting matches to whole words, or using regular expressions.

grep provides all of the above features and much more when it comes to searching and extracting content from text files. After getting used to grep, the search features provided by GUI programs would feel slower and inadequate.
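As a small taste of those refinements, here is a minimal sketch. The file name sample.txt and its contents are made up for this illustration, and the results noted in the comments assume GNU grep.

```shell
# sample input, invented for this illustration
printf 'Cat\ncat\nconcatenate\nscatter\n' > sample.txt

# default: case-sensitive match anywhere in the line
# matches: cat, concatenate, scatter
grep 'cat' sample.txt

# -i ignores case, so 'Cat' is matched as well
grep -i 'cat' sample.txt

# -w restricts the match to whole words
# matches: cat
grep -w 'cat' sample.txt

# -E enables extended regular expressions
# matches: concatenate, scatter
grep -E '^(con|s)cat' sample.txt
```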

Installation

If you are on a Unix-like system, you will most likely have some version of grep already installed. This book is primarily about GNU grep and also has a chapter on ripgrep. As there are syntax and feature differences between various implementations, make sure you have these particular commands installed to follow along with the examples presented in this book.

GNU grep is part of the text creation and manipulation tools and comes by default on GNU/Linux distributions. To install a particular version, visit gnu: grep software. See also the release notes for an overview of changes between versions and the bug list if you think some command isn't working as expected.

Sample instructions for compiling the latest version are shown below. You might need to install a PCRE library first, for example sudo apt install libpcre2-dev.

$ wget https://ftp.gnu.org/gnu/grep/grep-3.11.tar.xz
$ tar -xf grep-3.11.tar.xz
$ cd grep-3.11/
# see https://askubuntu.com/q/237576 if you get compiler not found error
$ ./configure
$ make
$ sudo make install

$ grep -V | head -n1
grep (GNU grep) 3.11

If you are not using a Linux distribution, you may be able to access GNU grep using one of the options below:

Git for Windows — provides a Bash emulation used to run Git from the command line
Windows Subsystem for Linux — compatibility layer for running Linux binary executables natively on Windows
brew — Package Manager for macOS (or Linux)
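Whichever route you take, it helps to confirm that the grep found on your PATH is actually the GNU implementation. The check below is only a sketch: the function name check_grep is made up, and it relies on GNU grep printing "GNU grep" in the first line of its version banner, as in the sample output shown earlier.

```shell
# hypothetical helper: reports whether 'grep' on PATH is GNU grep
check_grep() {
    # GNU grep prints a banner such as: grep (GNU grep) 3.11
    if grep -V 2>/dev/null | head -n1 | grep -q 'GNU grep'; then
        echo 'GNU grep found'
    else
        echo 'grep on PATH is not GNU grep'
    fi
}

check_grep
```

As far as I know, Homebrew installs GNU grep under the name ggrep by default, so on macOS you may need to run ggrep -V instead.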

Options overview

It is always good to know where to find documentation. From the command line, you can use man grep for a short manual and info grep for the full documentation. I prefer using the online GNU grep manual, which feels much easier to use and navigate.

$ man grep
NAME
       grep - print lines that match patterns

SYNOPSIS
       grep [OPTION...] PATTERNS [FILE...]
       grep [OPTION...] -e PATTERNS ... [FILE...]
       grep [OPTION...] -f PATTERN_FILE ... [FILE...]

DESCRIPTION
       grep searches for PATTERNS in each FILE.  PATTERNS is one or more
       patterns separated by newline characters, and  grep  prints  each
       line that matches a pattern.  Typically PATTERNS should be quoted
       when grep is used in a shell command.

       A FILE of “-” stands for standard input.  If no  FILE  is  given,
       recursive   searches   examine   the   working   directory,   and
       nonrecursive searches read standard input.
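The synopsis and description above can be sketched with a few commands. The file names fruits.txt and patterns.txt and their contents are invented for this illustration, and the results noted in the comments assume GNU grep.

```shell
# sample input, invented for this illustration
printf 'apple\nbanana\ncherry\n' > fruits.txt

# PATTERNS can hold several patterns separated by newline characters
# matches: apple, cherry
grep 'apple
cherry' fruits.txt

# the same search with one -e option per pattern
grep -e 'apple' -e 'cherry' fruits.txt

# the same search with the patterns read from a file
printf 'apple\ncherry\n' > patterns.txt
grep -f patterns.txt fruits.txt

# no FILE given: a nonrecursive search reads standard input
# matches: banana
printf 'apple\nbanana\n' | grep 'an'

# '-' stands for standard input and can be mixed with file names
# with more than one input, each output line gets a file name prefix
printf 'mango\n' | grep 'an' - fruits.txt
```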

For a quick overview of all the available options, use grep --help from the command line. The options are tabulated below.

Regexp selection:

Option	Description
-E, --extended-regexp	PATTERNS are extended regular expressions
-F, --fixed-strings	PATTERNS are strings
-G, --basic-regexp	PATTERNS are basic regular expressions
-P, --perl-regexp	PATTERNS are Perl regular expressions
-e, --regexp=PATTERNS	use PATTERNS for matching
-f, --file=FILE	take PATTERNS from FILE
-i, --ignore-case	ignore case distinctions in patterns and data
--no-ignore-case	do not ignore case distinctions (default)
-w, --word-regexp	match only whole words
-x, --line-regexp	match only whole lines
-z, --null-data	a data line ends in 0 byte, not newline
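To get a feel for how the regexp selection options change the meaning of a pattern, here is a minimal sketch. The file name expr.txt and its contents are made up, and the results noted in the comments assume GNU grep.

```shell
# sample input, invented for this illustration
printf 'a+b\naab\na.b\nab\n' > expr.txt

# default (-G, basic regular expressions): '+' is an ordinary character
# matches: a+b
grep 'a+b' expr.txt

# -E (extended regular expressions): '+' means one or more of the preceding item
# matches: aab, ab
grep -E 'a+b' expr.txt

# without -F, '.' matches any single character
# matches: a+b, aab, a.b
grep 'a.b' expr.txt

# -F treats the pattern as a plain string
# matches: a.b
grep -F 'a.b' expr.txt

# -x requires the pattern to match the whole line
# matches: ab
grep -x 'ab' expr.txt
```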

Miscellaneous:

Option	Description
-s, --no-messages	suppress error messages
-v, --invert-match	select non-matching lines
-V, --version	display version information and exit
--help	display this help text and exit
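A short sketch of two of these options follows. The file name notes.txt and its contents are made up, and no_such_file.txt is assumed not to exist in the current directory.

```shell
# sample input, invented for this illustration
printf 'keep\n# comment\nalso keep\n' > notes.txt

# -v selects the lines that do NOT match
# output: keep, also keep
grep -v '^#' notes.txt

# -s hides the "No such file or directory" message,
# but the exit status still signals the error
grep -s 'keep' no_such_file.txt || echo "exit status: $?"
# exit status: 2
```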

Output control:

Option	Description
-m, --max-count=NUM	stop after NUM selected lines
-b, --byte-offset	print the byte offset with output lines
-n, --line-number	print line number with output lines
--line-buffered	flush output on every line
-H, --with-filename	print file name with output lines
-h, --no-filename	suppress the file name prefix on output
--label=LABEL	use LABEL as the standard input file name prefix
-o, --only-matching	show only nonempty parts of lines that match
-q, --quiet, --silent	suppress all normal output
--binary-files=TYPE	assume that binary files are TYPE; TYPE is 'binary', 'text', or 'without-match'
-a, --text	equivalent to --binary-files=text
-I	equivalent to --binary-files=without-match
-d, --directories=ACTION	how to handle directories; ACTION is 'read', 'recurse', or 'skip'
-D, --devices=ACTION	how to handle devices, FIFOs and sockets; ACTION is 'read' or 'skip'
-r, --recursive	like --directories=recurse
-R, --dereference-recursive	likewise, but follow all symlinks
--include=GLOB	search only files that match GLOB (a file pattern)
--exclude=GLOB	skip files that match GLOB
--exclude-from=FILE	skip files that match any file pattern from FILE
--exclude-dir=GLOB	skip directories that match GLOB
-L, --files-without-match	print only names of FILEs with no selected lines
-l, --files-with-matches	print only names of FILEs with selected lines
-c, --count	print only a count of selected lines per FILE
-T, --initial-tab	make tabs line up (if needed)
-Z, --null	print 0 byte after FILE name
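Here is a minimal sketch of a few commonly used output control options. The file name colors.txt and its contents are made up, and the results noted in the comments assume GNU grep.

```shell
# sample input, invented for this illustration
printf 'red apple\ngreen pear\nred cherry\n' > colors.txt

# -n prefixes each output line with its line number
# output: 1:red apple, 3:red cherry
grep -n 'red' colors.txt

# -c prints only the count of selected lines
# output: 2
grep -c 'red' colors.txt

# -o prints only the matching part, here the last word of each line
# output: apple, pear, cherry
grep -oE '[a-z]+$' colors.txt

# -m1 stops after the first selected line
# output: red apple
grep -m1 'red' colors.txt

# -l prints only the names of files with a match
# output: colors.txt
grep -l 'pear' colors.txt

# -q prints nothing; only the exit status reports the result
grep -q 'pear' colors.txt && echo 'found'
```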

Context control:

Option	Description
-B, --before-context=NUM	print NUM lines of leading context
-A, --after-context=NUM	print NUM lines of trailing context
-C, --context=NUM	print NUM lines of output context
-NUM	same as --context=NUM
--group-separator=SEP	print SEP on line between matches with context
--no-group-separator	do not print separator for matches with context
--color[=WHEN], --colour[=WHEN]	use markers to highlight the matching strings; WHEN is 'always', 'never', or 'auto'
-U, --binary	do not strip CR characters at EOL (MSDOS/Windows)
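The context options are easiest to understand on numbered input. In this sketch the file name nums.txt is made up and holds the numbers 1 to 10, one per line. The results noted in the comments assume GNU grep.

```shell
# sample input, invented for this illustration: numbers 1 to 10, one per line
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > nums.txt

# -A1 adds one line after each match
# output: 5, 6
grep -A1 '^5$' nums.txt

# -B1 adds one line before each match
# output: 4, 5
grep -B1 '^5$' nums.txt

# -C1 (or -1) adds one line on both sides
# output: 4, 5, 6
grep -C1 '^5$' nums.txt

# non-adjacent groups are separated by a line containing '--'
# output: 2, 3, --, 8, 9
grep -A1 -e '^2$' -e '^8$' nums.txt

# the separator can be changed or dropped
grep --group-separator='====' -A1 -e '^2$' -e '^8$' nums.txt
grep --no-group-separator -A1 -e '^2$' -e '^8$' nums.txt
```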

CLI text processing with GNU grep and ripgrep
