Preface
When it comes to command line text processing, the three major pillars are `grep` for filtering, `sed` for substitution and `awk` for field processing. These tools have overlapping features too, for example, all three of them have extensive filtering capabilities.
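As a rough illustration of that overlap, the sketch below filters the same input with each tool. The sample data is made up for this example and does not come from the book's files.

```shell
# Hypothetical sample input (not from the book)
sample='apple 10
banana 20
cherry 30'

# Each command prints only the lines that contain "an"
printf '%s\n' "$sample" | grep 'an'
printf '%s\n' "$sample" | sed -n '/an/p'
printf '%s\n' "$sample" | awk '/an/'
```

All three commands print the `banana 20` line, since it is the only one matching the regular expression `an`.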
Unlike `grep` and `sed`, `awk` is a programming language. However, this book intends to showcase `awk` one-liners that can be composed from the command line instead of focusing on larger scripts.
This book heavily leans on examples to present features one by one. Regular expressions will also be discussed in detail.
It is recommended that you manually type each example. Make an effort to understand the sample input as well as the solution presented and check if the output changes (or not!) when you alter some part of the input and the command. As an analogy, consider learning to drive a car: no matter how much you read about it or listen to explanations, you'd need practical experience to become proficient.
Prerequisites
You should be familiar with command line usage in a Unix-like environment. You should also be comfortable with concepts like file redirection and command pipelines. Knowing the basics of the `grep` and `sed` commands will be handy in understanding the filtering and substitution features of `awk`.
As `awk` is a programming language, you are also expected to be familiar with concepts like variables, printing, functions, control structures, arrays and so on.
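To give a feel for the level assumed, here is a minimal sketch that uses an array, a control structure and printing in a single `awk` command. The input data and the `result` variable are invented for this illustration.

```shell
# Hypothetical input: a name followed by a count on each line
result=$(printf 'tea 3\ncoffee 4\ntea 5\n' | awk '
    { total[$1] += $2 }                      # array indexed by the first field
    END {
        if (total["tea"] > total["coffee"])  # control structure
            print "tea:", total["tea"]
        else
            print "coffee:", total["coffee"]
    }')
printf '%s\n' "$result"
```

Here the counts for `tea` add up to 8 and the count for `coffee` is 4, so the command prints `tea: 8`. If a snippet like this reads comfortably, you likely have enough background to follow along.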
If you are new to the world of the command line, check out my Computing from the Command Line ebook and curated resources on Linux CLI and Shell scripting before starting this book.
Conventions
- The examples presented here have been tested with GNU awk version 5.2.2 and include features not available in earlier versions.
- Code snippets are copy pasted from the `GNU bash` shell and modified for presentation purposes. Some commands are preceded by comments to provide context and explanations. Other changes include blank lines added to improve readability, only the `real` time shown for speed comparisons, output skipped for commands like `wget` and so on.
- Unless otherwise noted, all examples and explanations are meant for ASCII input.
- `awk` would mean `GNU awk`, `sed` would mean `GNU sed`, `grep` would mean `GNU grep` and so on unless otherwise specified.
- External links are provided throughout the book for you to explore certain topics in more depth.
- The learn_gnuawk repo has all the code snippets and files used in examples, exercises and other details related to the book. If you are not familiar with the `git` command, click the Code button on the webpage to get the files.
Acknowledgements
- GNU awk documentation — manual and examples
- stackoverflow and unix.stackexchange — for getting answers to pertinent questions on `awk` and related commands
- tex.stackexchange — for help on pandoc and `tex` related questions
- /r/commandline/, /r/linux4noobs/, /r/linuxquestions/ and /r/linux/ — helpful forums
- canva — cover image
- oxipng, pngquant and svgcleaner — optimizing images
- Warning and Info icons by Amada44 under public domain
- arifmahmudrana for spotting an ambiguous explanation
- Pound-Hash for critical feedback
- mdBook — for web version of the book that you are currently reading
- mdBook-pagetoc — for adding table of contents for each chapter
- minify-html — for minifying html files
Special thanks to all my friends and online acquaintances for their help, support and encouragement, especially during these difficult times.
Feedback and Errata
I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors.
You can reach me via:
- Issue Manager: https://github.com/learnbyexample/learn_gnuawk/issues
- E-mail: learnbyexample.net@gmail.com
- Twitter: https://twitter.com/learn_byexample
Author info
Sundeep Agarwal is a lazy being who prefers to work just enough to support his modest lifestyle. He accumulated vast wealth working as a Design Engineer at Analog Devices and retired from the corporate world at the ripe age of twenty-eight. Unfortunately, he squandered his savings within a few years and had to scramble trying to earn a living. Against all odds, selling programming ebooks saved his lazy self from having to look for a job again. He can now afford all the fantasy ebooks he wants to read and spends an unhealthy amount of time browsing the internet.
When the creative muse strikes, he can be found working on yet another programming ebook (which invariably ends up having at least one example with regular expressions). Researching materials for his ebooks and everyday social media usage drowned his bookmarks, so he maintains curated resource lists for sanity's sake. He is thankful for free learning resources and open source tools. His own contributions can be found at https://github.com/learnbyexample.
List of books: https://learnbyexample.github.io/books/
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Code snippets are available under the MIT License.
Resources mentioned in the Acknowledgements section are available under their original licenses.
Book version
2.0
See Version_changes.md to track changes across book versions.