One-liner introduction
This chapter will give an overview of Ruby syntax for command line usage. You'll see examples to understand what kind of problems are typically suited for one-liners.
Why use Ruby for one-liners?
I'll assume that you are already familiar with use cases where the command line is more productive compared to GUI. See also this series of articles titled Unix as IDE.
A shell like Bash provides built-in commands and scripting features to easily solve and automate various tasks. External commands like grep, sed, awk, sort, find, parallel, etc. help to solve a wide variety of text processing tasks. These tools are often combined to work together along with shell features like pipelines, wildcards and loops. You can use Ruby as an alternative to such external tools and also to complement them for some use cases.
Here are some sample text processing tasks that you can solve using Ruby one-liners. Options and related details will be explained later.
# retain only the first copy of duplicated lines
ruby -e 'puts readlines.uniq' *.txt
# retain only the first copy of duplicated lines,
# using the second field as the comparison criteria
ruby -e 'puts readlines.uniq {_1.split[1]}' *.txt
# extract only URLs
# uses a third-party library CommonRegexRuby
ruby -rcommonregex -ne 'puts CommonRegex.get_links($_)' *.md
Here are some questions that I've answered with a simpler Ruby solution compared to other CLI tools:
- stackoverflow: merge duplicate key values while preserving order
- unix.stackexchange: pair each line of file
The selling points of Ruby over tools like grep, sed and awk include a feature-rich regular expression engine and standard/third-party modules. Another advantage is that Ruby is more portable, given the many differences between GNU, BSD and other such implementations. The main disadvantage is that Ruby is likely to be more verbose and slower for features that are supported out of the box by those tools.
Installation and Documentation
See ruby-lang.org for instructions on installing Ruby.
Visit ruby-doc.org for documentation.
Command line options
Use ruby -h to get a list of command line options, along with a brief description.
Option | Description |
---|---|
-0[octal] | specify record separator (\0, if no argument) |
-a | autosplit mode with -n or -p (splits $_ into $F) |
-c | check syntax only |
-Cdirectory | cd to directory before executing your script |
-d | set debugging flags (set $DEBUG to true) |
-e 'command' | one line of script. Several -e's allowed. Omit [programfile] |
-Eex[:in] | specify the default external and internal character encodings |
-Fpattern | split() pattern for autosplit (-a) |
-i[extension] | edit ARGV files in place (make backup if extension supplied) |
-Idirectory | specify $LOAD_PATH directory (may be used more than once) |
-l | enable line ending processing |
-n | assume 'while gets(); ... end' loop around your script |
-p | assume loop like -n but print line also like sed |
-rlibrary | require the library before executing your script |
-s | enable some switch parsing for switches after script name |
-S | look for the script using PATH environment variable |
-v | print the version number, then turn on verbose mode |
-w | turn warnings on for your script |
-W[level=2|:category] | set warning level; 0=silence, 1=medium, 2=verbose |
-x[directory] | strip off text before #!ruby line and perhaps cd to directory |
--jit | enable JIT for the platform, same as --rjit (experimental) |
--rjit | enable pure-Ruby JIT compiler (experimental) |
-h | show this message, --help for more info |
This chapter will show examples with the -e, -n, -p and -a options. Some more options will be covered in later chapters, but not all of them are discussed in this book.
Executing Ruby code
If you want to execute a Ruby program file, one way is to pass the filename as an argument to the ruby command.
$ echo 'puts "Hello Ruby"' > hello.rb
$ ruby hello.rb
Hello Ruby
For short programs, you can also directly pass the code as an argument to the -e option.
$ ruby -e 'puts "Hello Ruby"'
Hello Ruby
# multiple statements can be issued separated by ;
$ ruby -e 'x=25; y=12; puts x**y'
59604644775390625
# or use -e option multiple times
$ ruby -e 'x=25' -e 'y=12' -e 'puts x**y'
59604644775390625
Filtering
Ruby one-liners can be used for filtering lines matched by a regular expression (regexp), similar to the grep, sed and awk commands. And similar to many command line utilities, Ruby can accept input from both stdin and file arguments.
# sample stdin data
$ printf 'gate\napple\nwhat\nkite\n'
gate
apple
what
kite
# print lines containing 'at'
# same as: grep 'at' and sed -n '/at/p' and awk '/at/'
$ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if /at/'
gate
what
# print lines NOT containing 'e'
# same as: grep -v 'e' and sed -n '/e/!p' and awk '!/e/'
$ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if !/e/'
what
By default, grep, sed and awk automatically loop over the input content line by line (with newline character as the default line separator). To do so with Ruby, you can use the -n and -p options. As seen before, the -e option accepts code as a command line argument. Many shortcuts are available to reduce the amount of typing needed.
In the above examples, a regular expression (defined by the pattern between a pair of forward slashes) has been used to filter the input. When the input string isn't specified in a conditional context (for example: if), the test is performed against the global variable $_, which has the contents of the current input line (the correct term would be input record, as discussed in the Record separators chapter). To summarize, in a conditional context:
- /regexp/ is a shortcut for $_ =~ /regexp/
- !/regexp/ is a shortcut for $_ !~ /regexp/
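If it helps to see the shortcuts spelled out, here's a minimal sketch of roughly what the two filtering one-liners amount to when written as a plain script. The helper names select_matching and reject_matching are made up for this illustration, and the real -n loop reads records via gets instead of walking over a string.

```ruby
# Hand-written approximation of: ruby -ne 'print if /at/'
# select_matching and reject_matching are hypothetical helpers for illustration.
def select_matching(input, regexp)
  input.each_line.select do |line|
    $_ = line        # -n stores the current input record in $_
    $_ =~ regexp     # explicit form of a bare /regexp/ in a condition
  end
end

# Hand-written approximation of: ruby -ne 'print if !/e/'
def reject_matching(input, regexp)
  input.each_line.select do |line|
    $_ = line
    $_ !~ regexp     # explicit form of !/regexp/ in a condition
  end
end

data = "gate\napple\nwhat\nkite\n"
print select_matching(data, /at/).join   # lines containing 'at'
print reject_matching(data, /e/).join    # lines NOT containing 'e'
```

The kept lines still carry their trailing newline, which is why print (and not puts) is enough to display them.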
$_ is also the default argument for the print method, which is why print is generally preferred in one-liners over the puts method. More such defaults that apply to the print method will be discussed later.
See ruby-doc: Pre-Defined Global Variables for documentation on $_, $&, etc.
Here's an example with file input instead of stdin.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
# same as: grep -oE '[0-9]+$' table.txt
# digits at the end of lines
$ ruby -ne 'puts $& if /\d+$/' table.txt
42
7
14
# digits at the end of lines that are not preceded by -
$ ruby -ne 'puts $& if /(?<!-)\d+$/' table.txt
42
14
The example_files directory has all the files used in the examples (like table.txt in the above illustration).
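The two table.txt commands can also be tried without the file. The sketch below assumes the three lines are typed in by hand, and trailing_digits is a helper name I've made up for this illustration. It shows that $& holds the matched portion after a successful =~ test, and how the negative lookbehind changes the result.

```ruby
# Sample lines copied from table.txt
TABLE_LINES = [
  "brown bread mat hair 42",
  "blue cake mug shirt -7",
  "yellow banana window shoes 3.14"
]

# trailing_digits is a hypothetical helper, not a Ruby built-in.
# $& is the text matched by the most recent successful regexp match.
def trailing_digits(lines, regexp)
  lines.map { |line| $& if line =~ regexp }.compact
end

p trailing_digits(TABLE_LINES, /\d+$/)         # digits at the end of lines
p trailing_digits(TABLE_LINES, /(?<!-)\d+$/)   # same, but not preceded by -
```

The first call gives 42, 7 and 14, while the second one drops the 7 since it is preceded by a - character.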
Substitution
Use the sub and gsub methods for search and replace requirements. By default, these methods operate on $_ when the input string isn't provided. For these examples, the -p option is used instead of -n, so that the value of $_ is automatically printed after processing each input line.
# for each input line, change only the first ':' to '-'
# same as: sed 's/:/-/' and awk '{sub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'sub(/:/, "-")'
1-2:3:4
a-b:c:d
# for each input line, change all ':' to '-'
# same as: sed 's/:/-/g' and awk '{gsub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'gsub(/:/, "-")'
1-2-3-4
a-b-c-d
You might wonder how $_ is modified without the use of ! methods. The reason is that these methods are part of Kernel (see ruby-doc: Kernel for details) and are available only when the -n or -p option is used.
- sub(/regexp/, repl) is a shortcut for $_.sub(/regexp/, repl) and $_ will be updated if the substitution succeeds
- gsub(/regexp/, repl) is a shortcut for $_.gsub(/regexp/, repl) and $_ gets updated if the substitution succeeds
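Here's a rough sketch of those two shortcuts written out as an ordinary script, where the Kernel versions aren't available. The helpers replace_first and replace_all are names I've invented for this illustration. They only approximate what -p does, by reassigning $_ and collecting it for printing.

```ruby
# replace_first and replace_all are hypothetical helpers for illustration.
# Approximates: ruby -pe 'sub(regexp, repl)'
def replace_first(input, regexp, repl)
  input.each_line.map do |line|
    $_ = line
    $_ = $_.sub(regexp, repl)    # String#sub returns a new string, so reassign
    $_                           # -p prints $_ after each record
  end.join
end

# Approximates: ruby -pe 'gsub(regexp, repl)'
def replace_all(input, regexp, repl)
  input.each_line.map do |line|
    $_ = line
    $_ = $_.gsub(regexp, repl)
    $_
  end.join
end

data = "1:2:3:4\na:b:c:d\n"
print replace_first(data, /:/, "-")   # only the first ':' of each line
print replace_all(data, /:/, "-")     # every ':' of each line
```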
This book assumes that you are already familiar with regular expressions. If not, you can check out my free ebook Understanding Ruby Regexp.
Field processing
Consider the sample input file shown below with fields separated by a single space character.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
Here are some examples that are based on specific fields rather than the entire line. The -a option will cause the input line to be split based on whitespace and the array contents can be accessed using the $F global variable. Leading and trailing whitespace will be suppressed, so there's no possibility of empty fields. More details will be discussed in the Default field separation section.
# print the second field of each input line
# same as: awk '{print $2}' table.txt
$ ruby -ane 'puts $F[1]' table.txt
bread
cake
banana
# print lines only if the last field is a negative number
# same as: awk '$NF<0' table.txt
$ ruby -ane 'print if $F[-1].to_f < 0' table.txt
blue cake mug shirt -7
# change 'b' to 'B' only for the first field
# same as: awk '{gsub(/b/, "B", $1)} 1' table.txt
$ ruby -ane '$F[0].gsub!(/b/, "B"); puts $F * " "' table.txt
Brown bread mat hair 42
Blue cake mug shirt -7
yellow banana window shoes 3.14
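To get a feel for what -a provides, the sketch below treats $F as roughly the result of calling split on each line. The input is pasted into a heredoc instead of being read from table.txt, and all the helper names are my own for this illustration.

```ruby
TABLE = <<~EOS
  brown bread mat hair 42
  blue cake mug shirt -7
  yellow banana window shoes 3.14
EOS

# Roughly what -a does per line: split on whitespace, no empty fields
def fields(text)
  text.each_line.map(&:split)
end

# Like: ruby -ane 'puts $F[1]'
def second_fields(text)
  fields(text).map { |f| f[1] }
end

# Like: ruby -ane 'print if $F[-1].to_f < 0'
def negative_last(text)
  text.each_line.select { |line| line.split[-1].to_f < 0 }
end

# Like: ruby -ane '$F[0].gsub!(/b/, "B"); puts $F * " "'
def capital_b_first(text)
  fields(text).map { |f| f[0] = f[0].gsub(/b/, "B"); f * " " }
end

puts second_fields(TABLE)
print negative_last(TABLE).join
puts capital_b_first(TABLE)
```

Note that $F * " " is the same as $F.join(" "), so the last helper rebuilds each line with single spaces between fields.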
BEGIN and END
You can use a BEGIN{} block when you need to execute something before the input is read and an END{} block to execute something after all of the input has been processed.
# same as: awk 'BEGIN{print "---"} 1; END{print "%%%"}'
# note the use of ; after the BEGIN block
$ seq 4 | ruby -pe 'BEGIN{puts "---"}; END{puts "%%%"}'
---
1
2
3
4
%%%
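As a mental model, the ordering can be sketched with a plain method. This is only an approximation of the one-liner above, with wrap_with_markers being a made-up name: the BEGIN code runs once before any input, the -p loop prints every line, and the END code runs once after the input is exhausted.

```ruby
# wrap_with_markers is a hypothetical helper that mimics:
#   ruby -pe 'BEGIN{puts "---"}; END{puts "%%%"}'
def wrap_with_markers(input)
  out = +""
  out << "---\n"                            # BEGIN{puts "---"}
  input.each_line { |line| out << line }    # -p prints each input line
  out << "%%%\n"                            # END{puts "%%%"}
  out
end

print wrap_with_markers("1\n2\n3\n4\n")
```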
ENV hash
When it comes to automation and scripting, you'd often need to construct commands that can accept input from users, use data from files and the output of a shell command and so on. As mentioned before, this book assumes bash as the shell being used. To access environment variables of the shell, you can use the special hash variable ENV with the name of the environment variable as a string key.
# existing environment variable
# output shown here is for my machine, would differ for you
$ ruby -e 'puts ENV["HOME"]'
/home/learnbyexample
$ ruby -e 'puts ENV["SHELL"]'
/bin/bash
# defined along with the command
# note that the variable definition is placed before the command
$ word='hello' ruby -e 'puts ENV["word"]'
hello
# the characters are preserved as is
$ ip='hi\nbye' ruby -e 'puts ENV["ip"]'
hi\nbye
Here's another example, where a regexp is passed via an environment variable.
$ cat word_anchors.txt
sub par
spar
apparent effort
two spare computers
cart part tart mart
# assume 'r' is a shell variable containing user provided regexp
$ r='\Bpar\B'
$ rgx="$r" ruby -ne 'print if /#{ENV["rgx"]}/' word_anchors.txt
apparent effort
two spare computers
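The same idea can be tried within a single script. In this sketch, the environment variables are set from Ruby itself (standing in for the shell assignment placed before the command), the lines of word_anchors.txt are typed in by hand, and grep_env is a helper name I've made up for this illustration.

```ruby
WORD_ANCHORS = [
  "sub par",
  "spar",
  "apparent effort",
  "two spare computers",
  "cart part tart mart"
]

# grep_env is a hypothetical helper: it builds a regexp from the
# content of the named environment variable and filters the lines
def grep_env(lines, name)
  lines.grep(/#{ENV[name]}/)
end

ENV["rgx"] = '\Bpar\B'        # stands in for: rgx="$r" ruby ...
puts grep_env(WORD_ANCHORS, "rgx")

ENV["ip"] = 'hi\nbye'         # backslash and n stay as two characters
puts ENV["ip"].length
```

The second part is there to show that ENV values are plain strings, so the backslash sequence isn't converted to a newline.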
You can also make use of the -s option to assign a global variable.
$ r='\Bpar\B'
$ ruby -sne 'print if /#{$rgx}/' -- -rgx="$r" word_anchors.txt
apparent effort
two spare computers
As an example, see my repo ch: command help for a practical shell script, where commands are constructed dynamically.
Executing external commands
You can call external commands using the system Kernel method. See ruby-doc: system for documentation.
$ ruby -e 'system("echo Hello World")'
Hello World
$ ruby -e 'system("wc -w <word_anchors.txt")'
12
$ ruby -e 'system("seq -s, 10 > out.txt")'
$ cat out.txt
1,2,3,4,5,6,7,8,9,10
The return value of system or the global variable $? can be used to act upon the exit status of the command issued.
$ ruby -e 'es=system("ls word_anchors.txt"); puts es'
word_anchors.txt
true
$ ruby -e 'system("ls word_anchors.txt"); puts $?'
word_anchors.txt
pid 6087 exit 0
$ ruby -e 'system("ls xyz.txt"); puts $?'
ls: cannot access 'xyz.txt': No such file or directory
pid 6164 exit 2
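If you'd like the exit status as a number instead of the full process description, $? responds to the exitstatus method. The sketch below avoids depending on ls by asking the currently running Ruby interpreter (located via RbConfig.ruby) to exit with a chosen code. The run_status helper is a name I've made up for this illustration.

```ruby
require 'rbconfig'

# run_status is a hypothetical helper that returns both pieces of
# information: the return value of system and the numeric exit status
def run_status(*cmd)
  result = system(*cmd)
  [result, $?.exitstatus]
end

ruby = RbConfig.ruby                  # path of the running interpreter
p run_status(ruby, "-e", "exit 0")    # command succeeded
p run_status(ruby, "-e", "exit 2")    # command failed with status 2
```

system gives true for a zero exit status and false for a non-zero one. It gives nil when the command couldn't be executed at all.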
To save the result of an external command, use backticks or %x.
$ ruby -e 'words = `wc -w <word_anchors.txt`; puts words'
12
$ ruby -e 'nums = %x/seq 3/; print nums'
1
2
3
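Here's a small sketch showing that the captured value is an ordinary string, trailing newline included, so the usual string methods can be applied to it. It assumes a shell environment where echo is available, and capture is a helper name I've made up for this illustration.

```ruby
# capture is a hypothetical helper wrapping the backtick syntax
def capture(cmd)
  `#{cmd}`
end

greeting = capture("echo Hello World")
p greeting                      # the trailing newline is part of the string
puts greeting.split.size        # count whitespace separated words

nums = %x/echo 1 2 3/           # %x is an alternative to backticks
puts nums.split.map(&:to_i).sum
puts $?.success?                # exit status is available here too
```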
See also stackoverflow: difference between exec, system and %x() or backticks.
Summary
This chapter introduced some of the common options for Ruby CLI usage, along with some typical text processing examples. While special-purpose CLI tools like grep, sed and awk are usually faster, Ruby has a much more extensive standard library and ecosystem. And you do not have to learn a lot if you are already comfortable with Ruby but not familiar with those CLI tools. The next section has a few exercises for you to practice the CLI options and text processing use cases.
Exercises
All the exercises are also collated together in one place at Exercises.md. For solutions, see Exercise_solutions.md.
The exercises directory has all the files used in this section.
1) For the input file ip.txt, display all lines containing is.
$ cat ip.txt
Hello World
How are you
This game is good
Today is sunny
12345
You are funny
##### add your solution here
This game is good
Today is sunny
2) For the input file ip.txt, display the first field of lines not containing y. Consider space as the field separator for this file.
##### add your solution here
Hello
This
12345
3) For the input file ip.txt, display all lines containing no more than 2 fields.
##### add your solution here
Hello World
12345
4) For the input file ip.txt, display all lines containing is in the second field.
##### add your solution here
Today is sunny
5) For each line of the input file ip.txt, replace the first occurrence of o with 0.
##### add your solution here
Hell0 World
H0w are you
This game is g0od
T0day is sunny
12345
Y0u are funny
6) For the input file table.txt, calculate and display the product of numbers in the last field of each line. Consider space as the field separator for this file.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
##### add your solution here
-923.1600000000001
7) Append . to all the input lines for the given stdin data.
$ printf 'last\nappend\nstop\ntail\n' | ##### add your solution here
last.
append.
stop.
tail.
8) Use contents of the s variable to display matching lines from the input file ip.txt. Assume that s doesn't have any regexp metacharacters. Construct the solution such that there's at least one word character immediately preceding the contents of the s variable.
$ s='is'
##### add your solution here
This game is good
9) Use system to display the contents of the filename present in the second field of the given input line. Consider space as the field separator.
$ s='report.log ip.txt sorted.txt'
$ echo "$s" | ##### add your solution here
Hello World
How are you
This game is good
Today is sunny
12345
You are funny
$ s='power.txt table.txt'
$ echo "$s" | ##### add your solution here
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14