Strings and user input

This chapter will discuss various ways to specify string literals. After that, you'll see how to get input data from the user and handle type conversions.

Single and double quoted strings

The most common way to declare string literals is by enclosing a sequence of characters within single or double quotes. Unlike some other scripting languages such as Bash, Perl and Ruby, there is no feature difference between these two forms.

The REPL will again be used predominantly in this chapter. One important detail to note is that the REPL displays the result of an expression using the syntax of that particular data type, so a string value is shown with quotes around it. Use the print() function when you want to see how a string literal looks visually.

>>> 'hello'
'hello'
>>> print("world")
world

If the string literal itself contains single or double quote characters, the other form can be used.

>>> print('"Will you come?" he asked.')
"Will you come?" he asked.

>>> print("it's a fine sunny day")
it's a fine sunny day

What should you do if a string literal has both single and double quotes? You can use the \ character to escape the quote characters. In the examples below, \' and \" will evaluate to ' and " characters respectively, instead of prematurely terminating the string definition. Use \\ if a literal backslash character is needed.

>>> print('"It\'s so pretty!" can I get one?')
"It's so pretty!" can I get one?

>>> print("\"It's so pretty!\" can I get one?")
"It's so pretty!" can I get one?

In general, the backslash character is used to construct escape sequences. For example, \n represents the newline character, \t is for the tab character and so on. You can use \ooo and \xhh to represent 256 characters in octal and hexadecimal formats respectively. For Unicode characters, you can use \N{name}, \uxxxx and \Uxxxxxxxx formats. See docs.python: String and Bytes literals for the full list of escape sequences and details about undefined ones.

>>> greeting = 'hi there.\nhow are you?'
>>> greeting
'hi there.\nhow are you?'
>>> print(greeting)
hi there.
how are you?

>>> print('item\tquantity')
item    quantity

>>> print('\u03b1\u03bb\u03b5\N{LATIN SMALL LETTER TURNED DELTA}')
αλεƍ
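
As a small sketch of the escape sequences described above (the variable names are only for illustration), the same character can be written in several ways, and each escape sequence counts as a single character in the resulting string:

```python
# the letter A written with four different escape sequences
octal_a = '\101'                        # octal 101 is decimal 65
hex_a = '\x41'                          # hexadecimal 41 is decimal 65
unicode_a = '\u0041'
named_a = '\N{LATIN CAPITAL LETTER A}'
print(octal_a, hex_a, unicode_a, named_a)   # A A A A

# \\ results in a single literal backslash character
path = 'C:\\temp'
print(path)             # C:\temp

# each escape sequence is one character: a, tab, b, newline
print(len('a\tb\n'))    # 4
```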

Triple quoted strings

You can also declare multiline strings by enclosing the value within three single or three double quote characters. If a backslash is the last character of a line, then a newline won't be inserted at that position. Here's a Python program named triple_quotes.py to illustrate this concept.

# triple_quotes.py
print('''hi there.
how are you?''')

student = '''\
Name:\tlearnbyexample
Age:\t25
Dept:\tCSE'''

print(student)

Here's the output of the above script:

$ python3.13 triple_quotes.py
hi there.
how are you?
Name:   learnbyexample
Age:    25
Dept:   CSE
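
To see the effect of the trailing backslash more directly, here is a minimal sketch that compares two triple quoted strings. The variable names are made up for this illustration, and the repr() built-in function is used so that the output shows the newline characters the same way the REPL would.

```python
with_backslash = '''\
one
two'''

without_backslash = '''
one
two'''

# repr() shows the string using Python's own syntax
print(repr(with_backslash))      # 'one\ntwo'
print(repr(without_backslash))   # '\none\ntwo'

# the only difference is the newline right after the opening quotes
print(without_backslash == '\n' + with_backslash)   # True
```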

See the Docstrings section for another use of triple quoted strings.

Raw strings

For certain cases, escape sequences would be too much of a hindrance to work around. For example, file paths in Windows use \ as the delimiter. Another would be regular expressions, where the backslash character has yet another special meaning. Python provides a raw string syntax, where all the characters are treated literally. This form, also known as r-strings for short, requires an r or R character prefix to quoted strings. Forms like triple quoted strings and raw strings are for user convenience. Internally, there's just a single representation for string literals.

>>> print(r'item\tquantity')
item\tquantity

>>> r'item\tquantity'
'item\\tquantity'
>>> r'C:\Documents\blog\monsoon_trip.txt'
'C:\\Documents\\blog\\monsoon_trip.txt'
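
The point about a single internal representation can be checked with a short sketch like the one below (the variable names are illustrative). A raw string and a normal string with escaped backslashes produce the same value of the same type.

```python
raw = r'item\tquantity'
normal = 'item\\tquantity'

print(raw == normal)            # True
print(type(raw))                # <class 'str'>

# r'\t' is a backslash followed by t, whereas '\t' is a single tab character
print(len(r'\t'), len('\t'))    # 2 1
```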

Here's an example with the re built-in module. The import statement used below will be discussed in the Importing and creating modules chapter. See my book Understanding Python re(gex)? for details on regular expressions.

>>> import re

# numbers >= 100 with optional leading zeros
# you'd need \\b and \\d with normal strings
>>> re.findall(r'\b0*+\d{3,}\b', '0501 035 154 12 26 98234')
['0501', '154', '98234']

String operators

Python provides a wide variety of features to work with strings. This chapter introduces some of them, like the + and * operators in this section. Here are some examples to concatenate strings using the + operator. The operands can be any expression that results in a string value and you can use any of the different ways to specify a string literal.

>>> str1 = 'hello'
>>> str2 = ' world'
>>> str3 = str1 + str2
>>> print(str3)
hello world

>>> str3 + r'. 1\n2'
'hello world. 1\\n2'

Another way to concatenate is to simply place string literals next to each other. You can use zero or more whitespace characters between the two literals. But you cannot mix an expression and a string literal this way. If the strings are inside parentheses, you can also use a newline character to separate the literals and optionally use comments.

>>> 'hello' r' 1\n2\\3'
'hello 1\\n2\\\\3'

# note that ... is REPL's indication for multiline statements, blocks, etc
>>> print('hi '
...       'there')
hi there
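
Here is a small sketch of adjacent literals inside parentheses with comments, followed by the + operator for the case where a variable is involved. The names message, name and greeting are only for illustration.

```python
message = ('hello'      # first literal
           ' '          # a lone space
           r'1\n2')     # raw literal, so \n stays as two characters
print(message)          # hello 1\n2

name = 'world'
# writing 'hello ' name (a literal next to a variable) is a SyntaxError,
# so use the + operator when a variable or expression is involved
greeting = 'hello ' + name
print(greeting)         # hello world
```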

You can repeat a string by using the * operator between a string and an integer.

>>> style_char = '-'
>>> print(style_char * 50)
--------------------------------------------------
>>> word = 'buffalo '
>>> print(8 * word)
buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo 
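
The + and * operators can be combined in a single expression. The sketch below (with made-up variable names, and using the len() built-in function to get the number of characters) shows that * is applied before +, same as in arithmetic expressions.

```python
title = 'Report'
line = '=' * len(title)             # one '=' for every character of the title
banner = line + '\n' + title + '\n' + line
print(banner)

# * has higher precedence than +, use parentheses to change the order
print('ab' + 'c' * 3)       # abccc
print(('ab' + 'c') * 3)     # abcabcabc

# repeating zero times gives an empty string
print('x' * 0 == '')        # True
```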

String formatting

The Zen of Python (PEP 20) states:

There should be one-- and preferably only one --obvious way to do it.

However, there are several approaches available for formatting strings. This section will first focus on formatted string literals (f-strings for short) and then show the alternate options.

f-strings allow you to embed an expression within {} characters as part of the string literal. Like raw strings, you need to use a prefix, which is f or F in this case. Python will substitute the embeds with the result of the expression, converting it to string if necessary (numeric results for example). See docs.python: Format String Syntax and docs.python: Formatted string literals for documentation and more examples.

>>> str1 = 'hello'
>>> str2 = ' world'
>>> f'{str1}{str2}'
'hello world'

>>> f'{str1}({str2 * 3})'
'hello( world world world)'

Use {{ if you need to represent { literally. Similarly, use }} to represent } literally.

>>> f'{{hello'
'{hello'
>>> f'world}}'
'world}'

Adding = after an expression gives both the expression and the result in the output.

>>> num1 = 42
>>> num2 = 7

>>> f'{num1 + num2 = }'
'num1 + num2 = 49'
>>> f'{num1 + (num2 * 10) = }'
'num1 + (num2 * 10) = 112'
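
A few more details about the = feature, shown as a sketch with illustrative values: whitespace around the expression and the = character is retained in the output, string values are displayed with quotes, and a format specifier after a : character (numeric precision in this case) can be combined with it.

```python
num1 = 22
num2 = 7
fruit = 'apple'

# whitespace inside the braces is preserved in the output
print(f'{num1 + num2 = }')      # num1 + num2 = 29
print(f'{num1+num2=}')          # num1+num2=29

# string values are shown with quotes, similar to the REPL
print(f'{fruit = }')            # fruit = 'apple'

# combining = with a format specifier
print(f'{num1 / num2 = :.2f}')  # num1 / num2 = 3.14
```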

Optionally, you can provide a format specifier along with the expression after a : character. These specifiers are similar to the ones provided by the printf() function in the C language, the printf built-in command in Bash and so on. Here are some examples for numeric formatting.

>>> appx_pi = 22 / 7

# restricting the number of digits after the decimal point
>>> f'Approx pi: {appx_pi:.5f}'
'Approx pi: 3.14286'

# rounding is applied
>>> f'{appx_pi:.3f}'
'3.143'

# exponential notation 
>>> f'{32 ** appx_pi:.2e}'
'5.38e+04'

Here are some alignment examples:

>>> fruit = 'apple'

>>> f'{fruit:=>10}'
'=====apple'
>>> f'{fruit:=<10}'
'apple====='
>>> f'{fruit:=^10}'
'==apple==='

# default is the space character
>>> f'{fruit:^10}'
'  apple   '
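
Alignment, width and precision can be used together, which is handy for lining up columns. The following is a minimal sketch with made-up values. Note that strings are left aligned and numbers are right aligned when no alignment character is given.

```python
item = 'apple'
price = 22 / 7

# item padded to 10 characters, price to 8 characters with 2 decimal digits
row = f'{item:<10}{price:>8.2f}'
print(row)
print(len(row))     # 18

# default alignment: left for strings, right for numbers
print(f'{item:10}' == f'{item:<10}')        # True
print(f'{price:8.2f}' == f'{price:>8.2f}')  # True

# a fill character works with numbers as well
print(f'{42:*>6}')  # ****42
```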

You can use b, o and x to display integer values in binary, octal and hexadecimal formats respectively. Using # before these characters will add the appropriate prefix (0b, 0o and 0x respectively) for these formats.

>>> num = 42

>>> f'{num:b}'
'101010'
>>> f'{num:o}'
'52'
>>> f'{num:x}'
'2a'

>>> f'{num:#x}'
'0x2a'
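
As a quick sketch, here is the # prefix applied to all three formats. The prefixed results follow the same syntax as Python's integer literals, so they can be compared directly in code.

```python
num = 42

print(f'{num:#b}')      # 0b101010
print(f'{num:#o}')      # 0o52
print(f'{num:#x}')      # 0x2a

# the prefixed forms are valid integer literals with the same value
print(0b101010 == 0o52 == 0x2a == 42)   # True
```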

The str.format() method, the format() function and the % operator are alternate approaches for string formatting.

>>> num1 = 22
>>> num2 = 7

>>> 'Output: {} / {} = {:.2f}'.format(num1, num2, num1 / num2)
'Output: 22 / 7 = 3.14'

>>> format(num1 / num2, '.2f')
'3.14'

>>> 'Approx pi: %.2f' % (num1 / num2)
'Approx pi: 3.14'
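
The format specifiers seen earlier with f-strings also work with the str.format() method and the format() function. Here is a small sketch comparing the approaches (variable names are only for illustration). The % operator uses its own printf-style syntax instead.

```python
fruit = 'apple'

a = f'{fruit:=^10}'
b = '{:=^10}'.format(fruit)
c = format(fruit, '=^10')
print(a)                # ==apple===
print(a == b == c)      # True

# str.format() can refer to its arguments by position
print('{0} {1} {0}'.format('a', 'b'))       # a b a

# printf-style formatting, %s is for string values
print('%s costs %.2f' % (fruit, 22 / 7))    # apple costs 3.14
```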

See docs.python: The String format() Method and the sections that follow for more details about the above features. See docs.python: Format examples for more examples, including datetime formatting. The Text processing chapter will discuss more about the string processing methods.

In case you don't know what a method is, see stackoverflow: What's the difference between a method and a function?

User input

The input() built-in function can be used to get data from the user. It accepts an optional string argument, which is displayed as a prompt before Python waits for the input. This function always returns a string, which you can convert to another type if needed (explained in the next section).

# Python will wait until you type your text and press the Enter key
# the blinking cursor is represented by a rectangular block as shown below
>>> name = input('what is your name? ')
what is your name? █

Here's the rest of the above example.

>>> name = input('what is your name? ')
what is your name? learnbyexample

# note that newline isn't part of the value saved in the 'name' variable
>>> print(f'pleased to meet you {name}.')
pleased to meet you learnbyexample.
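
If you'd like to run this example as a script without typing anything, one possible approach is sketched below. It is only an illustration and not something you'd normally need: a canned line of text temporarily stands in for the keyboard, using io.StringIO from the standard library. The result confirms that input() returns a string without the trailing newline.

```python
import io
import sys

# Assumption for this sketch: a canned line stands in for the keyboard,
# as if the user typed learnbyexample and pressed the Enter key
original_stdin = sys.stdin
sys.stdin = io.StringIO('learnbyexample\n')

name = input('what is your name? ')

sys.stdin = original_stdin      # restore the normal input source

print()                         # move past the prompt line
print(type(name))               # <class 'str'>
print(name == 'learnbyexample') # True, the newline isn't part of the value
```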

Type conversion

The type() built-in function can be used to know what data type you are dealing with. You can pass any expression as an argument.

>>> num = 42
>>> type(num)
<class 'int'>

>>> type(22 / 7)
<class 'float'>

>>> type('Hi there')
<class 'str'>

The built-in functions int(), float() and str() can be used to convert from one data type to another. These function names are the same as their data type class names seen above.

>>> num = 3.14
>>> int(num)
3

# you can also use f'{num}'
>>> str(num)
'3.14'

>>> usr_ip = input('enter a float value ')
enter a float value 45.24e22
>>> type(usr_ip)
<class 'str'>
>>> float(usr_ip)
4.524e+23
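
Here's a short sketch showing why the conversion matters. The string assigned to usr_ip is a stand-in for a value returned by input(), and the other names are illustrative. Without conversion, operators like * and + apply their string behavior.

```python
usr_ip = '2.5'              # stand-in for a value returned by input()

print(usr_ip * 2)           # 2.52.5 (string repetition)
print(float(usr_ip) * 2)    # 5.0

# int() discards the fractional part of a float value
print(int(2.5))             # 2
print(int(-2.5))            # -2

# int('2.5') raises ValueError since the string isn't a valid integer,
# convert to float first if the fractional part should be discarded
whole = int(float(usr_ip))
print(whole)                # 2

print(str(42) + '1')        # 421 (string concatenation)
print(int('42') + 1)        # 43
```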

See docs.python: Built-in Functions for documentation on all of the built-in functions. You can also use the help() function from the REPL as discussed in the Documentation and getting help section.

Exercises

  • Read about the Bytes literal from docs.python: String and Bytes literals. See also stackoverflow: What is the difference between a string and a byte string?
  • If you check out docs.python: int() function, you'll see that the int() function accepts an optional argument. Write a program that asks the user for a hexadecimal number as input. Then, use the int() function to convert the input string to an integer (you'll need the second argument for this). Add 5 and display the result in hexadecimal format.
  • Write a program to accept two input values. The first can be either a number or a string value. The second is an integer value, which should be used as the width to display the first value in centered alignment. You can use any character you prefer to surround the value, other than the default space character.
  • What happens if you use a combination of r, f and other such valid prefix characters while declaring a string literal? For example, rf'a\{5/2}'. What happens if you use the raw strings syntax and provide only a single \ character? Does the documentation describe these cases?
  • Try out at least two format specifiers not discussed in this chapter.
  • Given a = 5, display '{5}' as the output using f-strings.