Preface
Scripting and automation tasks often need to extract particular portions of text from input data or modify them from one format to another. This book will help you understand Regular Expressions, a mini-programming language for all sorts of text processing needs.
This book leans heavily on examples to present the features of regular expressions one by one. It is recommended that you manually type each example and experiment with it. Understanding both the nature of the input strings and the output produced is essential. As an analogy, consider learning to drive a car: no matter how much you read about it or listen to explanations, you need practical experience to become proficient.
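As a small taste of the two tasks mentioned at the start of this preface (extracting a portion of text and converting it from one format to another), here is a minimal sketch using Python's built-in `re` module. The sample string and the variable names are made up for illustration and are not taken from the book's exercises.

```python
import re

# hypothetical input line, invented for this illustration
log_line = 'error 2023-01-15: disk usage at 91%'

# extract: pull out the portion that looks like a YYYY-MM-DD date
date = re.search(r'\d{4}-\d{2}-\d{2}', log_line)[0]

# modify: rewrite the date as DD/MM/YYYY using capture groups
reformatted = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', log_line)

print(date)
print(reformatted)
```

Try changing the input string or the pattern and see how the results differ.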
Prerequisites
You should be familiar with programming basics. You should also have a working knowledge of Python syntax and features like string formats, string methods and list comprehensions.
You are also expected to get comfortable with reading manuals, searching online, visiting external links provided for further reading, tinkering with illustrated examples, asking for help when you are stuck and so on. In other words, be proactive and curious instead of just consuming the content passively.
If you have prior experience with a programming language but not Python, see my curated list of learning resources before starting this book.
Conventions
- The examples presented here have been tested with Python version 3.11.1 and include features not available in earlier versions.
- Code snippets shown are copy pasted from the Python REPL shell and modified for presentation purposes. Some commands are preceded by comments to provide context and explanations. Blank lines have been added to improve readability. Error messages are shortened. `import` statements are skipped after initial use. And so on.
- Unless otherwise noted, all examples and explanations are meant for ASCII characters.
- External links are provided throughout the book for you to explore certain topics in more depth.
- The py_regular_expressions repo has all the code snippets and exercises used in the book. A solutions file is also provided. If you are not familiar with the `git` command, click the Code button on the webpage to get the files.
Acknowledgements
- Python documentation — manuals and tutorials
- /r/learnpython/, /r/Python/ and /r/regex/ — helpful forums for beginners and experienced programmers alike
- stackoverflow — for getting answers to pertinent questions on Python and regular expressions
- tex.stackexchange — for help on pandoc and `tex` related questions
- canva — cover image
- Warning and Info icons by Amada44 under public domain
- oxipng, pngquant and svgcleaner — optimizing images
- David Cortesi for helpful feedback on both the technical content and grammar issues
- Kye and gmovchan for spotting a typo
- Hugh's email exchanges helped me significantly to improve the presentation of concepts and exercises
- Christopher Patti for reviewing the book, providing feedback and brightening the day with kind words
- Users 73tada, DrBobHope, nlomb and others for feedback in this reddit thread
- mdBook — for web version of the book that you are currently reading
- mdBook-pagetoc — for adding table of contents for each chapter
- minify-html — for minifying html files
Special thanks to Al Sweigart. His Automate the Boring Stuff book was instrumental for me to get started with Python.
Feedback and Errata
I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors.
You can reach me via:
- Issue Manager: https://github.com/learnbyexample/py_regular_expressions/issues
- E-mail: learnbyexample.net@gmail.com
- Twitter: https://twitter.com/learn_byexample
Author info
Sundeep Agarwal is a lazy being who prefers to work just enough to support his modest lifestyle. He accumulated vast wealth working as a Design Engineer at Analog Devices and retired from the corporate world at the ripe age of twenty-eight. Unfortunately, he squandered his savings within a few years and had to scramble trying to earn a living. Against all odds, selling programming ebooks saved his lazy self from having to look for a job again. He can now afford all the fantasy ebooks he wants to read and spends an unhealthy amount of time browsing the internet.
When the creative muse strikes, he can be found working on yet another programming ebook (which invariably ends up having at least one example with regular expressions). Researching materials for his ebooks and everyday social media usage drowned his bookmarks, so he maintains curated resource lists for sanity's sake. He is thankful for free learning resources and open source tools. His own contributions can be found at https://github.com/learnbyexample.
List of books: https://learnbyexample.github.io/books/
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Code snippets are available under MIT License.
Resources mentioned in Acknowledgements section above are available under original licenses.
Book version
4.1
See Version_changes.md to track changes across book versions.