Tuple and Sequence operations
This chapter will discuss the tuple data type and some of the common sequence operations. Data types like str, range, list and tuple fall under Sequence types. Binary Sequence Types aren't discussed in this book.
Sequences and iterables
Quoting from docs.python glossary: sequence:
An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.
Partial quote from docs.python glossary: iterable:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects...
Some of the operations behave differently or do not apply to certain types. See docs.python: Common Sequence Operations for details.
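To make the distinction between the two glossary entries concrete, here is a small sketch that checks a few built-in objects against the abstract base classes in the collections.abc module. The module and the isinstance() checks are not part of this chapter; they are used here only for illustration, and the `samples` and `sequence_names` names are made up for this example.

```python
from collections.abc import Iterable, Sequence

# a few built-in objects: four sequences, followed by a dict and a set
samples = ('hi', [1, 2], (1, 2), range(3), {'a': 1}, {1, 2})

for obj in samples:
    # every sample is an iterable, but only some of them are sequences
    print(f'{type(obj).__name__:>5}: '
          f'iterable={isinstance(obj, Iterable)}, '
          f'sequence={isinstance(obj, Sequence)}')

# names of the types that qualify as sequences
sequence_names = tuple(type(obj).__name__ for obj in samples
                       if isinstance(obj, Sequence))
print(sequence_names)
```

All six objects can be used in a for loop, but only str, list, tuple and range support integer indexing, so dict and set are left out of `sequence_names`.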
Initialization
Tuples are declared as a collection of zero or more objects, separated by commas within () parentheses characters. Each element can be specified as a value by itself or as an expression. The outer parentheses are optional if comma separation is present. Here are some examples:
# can also use: empty_tuple = tuple()
>>> empty_tuple = ()
# note the trailing comma, otherwise the result will be a 'str' value
# can also be written as 'apple', (with the trailing comma) since parentheses are optional here
>>> one_element = ('apple',)
# multiple elements example
>>> dishes = ('Aloo tikki', 'Baati', 'Khichdi', 'Makki roti', 'Poha')
# mixed data type example, uses expressions as well
>>> mixed = (1+2, 'two', (-3, -4), empty_tuple)
>>> mixed
(3, 'two', (-3, -4), ())
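The trailing comma rule for single-element tuples is easy to get wrong, so here is a short script-style sketch (written with print() calls instead of the interactive prompt) that compares the three spellings. The variable names are made up for this illustration.

```python
no_comma = ('apple')      # just a parenthesized string
with_comma = ('apple',)   # single-element tuple
bare_comma = 'apple',     # also a tuple, parentheses are optional

print(type(no_comma).__name__)     # str
print(type(with_comma).__name__)   # tuple
print(with_comma == bare_comma)    # True

# len() counts characters for the string and elements for the tuple
print(len(no_comma), len(with_comma))   # 5 1
```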
You can use the tuple() built-in function to create a tuple from an iterable (described in the previous section).
>>> chars = tuple('hello')
>>> chars
('h', 'e', 'l', 'l', 'o')
>>> tuple(range(3, 10, 3))
(3, 6, 9)
Tuples are immutable, but individual elements can be either mutable or immutable. As an exercise, given chars = tuple('hello'), check the output of the expression chars[0] and the statement chars[0] = 'H'.
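As a sketch of what "individual elements can be mutable" means in practice, the snippet below stores a list inside a tuple and then changes the list in place. The `record` name is made up for this illustration, and the append() list method used here is covered in a later chapter.

```python
# a tuple whose second element is a (mutable) list
record = ('fruits', ['apple', 'fig'])

# the tuple still refers to the same list object,
# but the contents of that list can change
record[1].append('mango')

print(record)        # ('fruits', ['apple', 'fig', 'mango'])
print(len(record))   # 2, the tuple itself has not grown
```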
Slicing
One or more elements can be retrieved from a sequence using the slicing notation (this wouldn't work for an iterable like dict or set). It works similarly to the start/stop/step logic seen with the range() function. The default step is 1. The default values for start and stop depend on whether the step is positive or negative.
>>> primes = (2, 3, 5, 7, 11)
# index starts with 0
>>> primes[0]
2
# start=2 and stop=4, default step=1
# note that the element at index 4 (stop value) isn't part of the output
>>> primes[2:4]
(5, 7)
# default start=0
>>> primes[:3]
(2, 3, 5)
# default stop=len(seq) for positive values of step
>>> primes[3:]
(7, 11)
# shallow copy of the sequence, same as primes[::1]
>>> primes[:]
(2, 3, 5, 7, 11)
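Since slicing is a common sequence operation, the same notation works for other sequence types as well. The sketch below uses the slice() built-in, which isn't discussed in this chapter, to store the start and stop values once and apply them to a tuple, list, str and range. The `middle` name is made up for this illustration.

```python
primes = (2, 3, 5, 7, 11)

# slice(1, 4) represents the same selection as the 1:4 notation
middle = slice(1, 4)

print(primes[middle])              # (3, 5, 7)
print(list(primes)[middle])        # [3, 5, 7]
print('hello'[middle])             # ell
print(range(10, 60, 10)[middle])   # range(20, 50, 10)
```

Note that the result has the same type as the sequence being sliced.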
You can use a negative index to get elements from the end of the sequence. This is especially helpful when you don't know the size of the sequence. Given a positive integer n, the expression seq[-n] is evaluated as seq[len(seq) - n].
>>> primes = (2, 3, 5, 7, 11)
# len(primes) - 1 = 4, so this is same as primes[4]
>>> primes[-1]
11
# seq[-n:] will give the last n elements
>>> primes[-1:]
(11,)
>>> primes[-2:]
(7, 11)
Here are some examples with different step values.
>>> primes = (2, 3, 5, 7, 11)
# same as primes[0:5:2]
>>> primes[::2]
(2, 5, 11)
# retrieve elements in reverse direction
# note that the element at index 1 (stop value) isn't part of the output
>>> primes[3:1:-1]
(7, 5)
# reversed sequence
# would help you with the palindrome exercise from Control structures chapter
>>> primes[::-1]
(11, 7, 5, 3, 2)
As an exercise, given primes = (2, 3, 5, 7, 11),
- what happens if you use primes[5] or primes[-6]?
- what happens if you use primes[:5] or primes[-6:]?
- is it possible to get the same output as primes[::-1] by using an explicit number for the stop value? If not, why not?
Sequence unpacking
You can assign the individual elements of an iterable to multiple variables. This is known as sequence unpacking and it is handy in many situations.
>>> details = ('2018-10-25', 'car', 2346)
>>> purchase_date, vehicle, qty = details
>>> purchase_date
'2018-10-25'
>>> vehicle
'car'
>>> qty
2346
Here's how you can easily swap variable values.
>>> num1 = 3.14
>>> num2 = 42
>>> num3 = -100
# RHS is a single tuple object (recall that parentheses are optional)
>>> num1, num2, num3 = num3, num1, num2
>>> print(f'{num1 = }; {num2 = }; {num3 = }')
num1 = -100; num2 = 3.14; num3 = 42
Unpacking isn't limited to single value assignments. You can use a * prefix to assign all the remaining values, if any are left, to a list variable.
>>> values = ('first', 6.2, -3, 500, 'last')
>>> x, *y = values
>>> x
'first'
>>> y
[6.2, -3, 500, 'last']
>>> a, *b, c = values
>>> a
'first'
>>> b
[6.2, -3, 500]
>>> c
'last'
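Unpacking works with any iterable on the right-hand side, not just tuples, and the target on the left can be nested to match nested data. Here is a brief sketch with made-up variable names:

```python
# unpacking the characters of a string
first, second, third = 'abc'
print(first, second, third)   # a b c

# the * prefix works with other iterables too, such as range
start, *middle, end = range(5)
print(start, middle, end)     # 0 [1, 2, 3] 4

# a nested target mirrors the structure of the nested tuple
name, (x, y) = ('origin', (0, 0))
print(name, x, y)             # origin 0 0
```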
As an exercise, what do you think will happen for these cases, given nums = (1, 2):
- a, b, c = nums
- a, *b, c = nums
- *a, *b = nums
Returning multiple values
Tuples are also the preferred way to return multiple values from a function. Here are some examples:
>>> def min_max(iterable):
...     return min(iterable), max(iterable)
...
>>> min_max('visualization')
('a', 'z')
>>> small, big = min_max((10, -42, 53.2, -3))
>>> small
-42
>>> big
53.2
The min_max(iterable) user-defined function in the above snippet returns both minimum and maximum values of a given iterable input. min() and max() are built-in functions. You can either save the output as a tuple or unpack into multiple variables. You'll see built-in functions that return tuple as output later in this chapter.
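The two ways of consuming a tuple return value can be sketched as follows. The `first_last()` helper is hypothetical and exists only for this illustration. divmod() is a built-in function that returns a tuple, though it isn't otherwise covered in this chapter.

```python
def first_last(seq):
    # hypothetical helper: the comma in the return statement builds a tuple
    return seq[0], seq[-1]

# save the output as a single tuple
result = first_last('visualization')
print(result)               # ('v', 'n')

# or unpack it into separate variables
head, tail = first_last((10, -42, 53.2, -3))
print(head, tail)           # 10 -3

# divmod() returns the quotient and remainder as a tuple
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2
```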
The use of both min() and max() in the above example is for illustration purposes only. As an exercise, write custom logic that iterates only once over the input sequence and calculates both the minimum and maximum values simultaneously.
Iteration
You have already seen examples of for loops that iterate over a sequence data type. Here's a refresher:
>>> nums = (3, 6, 9)
>>> for n in nums:
...     print(f'square of {n} is {n ** 2}')
...
square of 3 is 9
square of 6 is 36
square of 9 is 81
In the above example, you get one element per iteration. If you need the index of the elements as well, you can use the enumerate() built-in function. You'll get a tuple value per iteration, containing the index (starting with 0 by default) and the value at that index. Here are some examples:
>>> nums = (42, 3.14, -2)
>>> for t in enumerate(nums):
...     print(t)
...
(0, 42)
(1, 3.14)
(2, -2)
>>> for idx, val in enumerate(nums):
...     print(f'{idx}: {val:>5}')
...
0:    42
1:  3.14
2:    -2
The enumerate() built-in function has a start=0 default valued argument. As an exercise, change the above snippet to start the index from 1 instead of 0.
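As one possible use of the index provided by enumerate(), the sketch below records the positions of the negative values in a tuple. The data and the `negative_idx` name are made up for this illustration, and the append() list method is covered in a later chapter.

```python
nums = (42, 3.14, -2, 0, -7)

# collect the index of every negative value
negative_idx = []
for idx, val in enumerate(nums):
    if val < 0:
        negative_idx.append(idx)

print(negative_idx)   # [2, 4]
```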
Arbitrary number of arguments
As seen before, the print() function can accept zero or more values separated by a comma. Here's a portion of the documentation as a refresher:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
You can write your own functions to accept an arbitrary number of arguments as well. The packing syntax is similar to the sequence unpacking examples seen earlier in the chapter. A * prefix to an argument name will allow it to accept zero or more values. Such an argument will be packed as a tuple data type and it should always be specified after positional arguments (if any). Idiomatically, args is used as the variable name. Here's an example:
>>> def many(a, *args):
...     print(f'{a = }; {args = }')
...
>>> many()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: many() missing 1 required positional argument: 'a'
>>> many(1)
a = 1; args = ()
>>> many(1, 'two', 3)
a = 1; args = ('two', 3)
Here's a more practical example:
>>> def sum_nums(*args):
...     total = 0
...     for n in args:
...         total += n
...     return total
...
>>> sum_nums()
0
>>> sum_nums(3, -8)
-5
>>> sum_nums(1, 2, 3, 4, 5)
15
As an exercise,
- add a default valued argument initial which should be used to initialize total instead of 0 in the above sum_nums() function. For example, sum_nums(3, -8) should give -5 and sum_nums(1, 2, 3, 4, 5, initial=5) should give 20.
- what would happen if you call the above function like sum_nums(initial=5, 2)?
- what would happen if you have nums = (1, 2) and call the above function like sum_nums(*nums, initial=3)?
- in what ways does this function differ from the sum() built-in function?
See also docs.python: Arbitrary Argument Lists.
The Arbitrary keyword arguments section in a later chapter will discuss how to define functions that accept an arbitrary number of keyword arguments.
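As a further sketch of the packing behavior described in this section, the hypothetical `average()` function below combines one required positional argument with a * prefixed argument, so that at least one value must always be passed. The function and its argument names are made up for this illustration.

```python
def average(first, *rest):
    # 'rest' is always a tuple, so it can be joined with another tuple
    nums = (first,) + rest
    return sum(nums) / len(nums)

print(average(4))            # 4.0
print(average(1, 2, 3))      # 2.0
print(average(10, -42, 2))   # -10.0
```

Requiring `first` separately means that calling average() with no arguments raises a TypeError for the missing argument instead of a division by zero error.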
zip
Use zip() to iterate over two or more iterables simultaneously. In each iteration, you'll get a tuple with an item from each of the iterables. Iteration will stop when any of the iterables is exhausted. See itertools.zip_longest() and stackoverflow: Zipped Python generators with 2nd one being shorter for alternatives.
Here's an example:
>>> odd = (1, 3, 5)
>>> even = (2, 4, 6)
>>> for i, j in zip(odd, even):
...     print(i + j)
...
3
7
11
As an exercise, write a function that returns the sum of products of the corresponding elements of two sequences. For example, the result should be 44 for (1, 3, 5) and (2, 4, 6).
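The stopping behavior mentioned at the start of this section can be sketched by zipping iterables of unequal length. The itertools.zip_longest() function keeps going until the longest input is exhausted and fills the gaps with the fillvalue argument (None by default). The `letters`, `nums`, `short` and `padded` names are made up for this illustration.

```python
from itertools import zip_longest

letters = 'abc'
nums = (1, 2)

# zip() stops as soon as the shorter input runs out
short = tuple(zip(letters, nums))
print(short)    # (('a', 1), ('b', 2))

# zip_longest() pads the shorter input with the given fillvalue
padded = tuple(zip_longest(letters, nums, fillvalue=0))
print(padded)   # (('a', 1), ('b', 2), ('c', 0))
```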
Tuple methods
While this book won't discuss Object-Oriented Programming (OOP) in any detail, you'll still see plenty of examples that use objects and their methods. You've already seen a few examples with modules. See Practical Python Programming and Fluent Python if you want to learn about Python OOP in depth. See also docs.python: Data model.
Data types in Python are all internally implemented as classes. You can use the dir() built-in function to get a list of valid attributes for an object.
# you can also use tuple objects such as 'odd' and 'even' declared earlier
>>> dir(tuple)
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__',
'__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
>>> even = (2, 4, 6)
# same as: len(even)
>>> even.__len__()
3
The non-dunder names (last two items) in the above listing will be discussed in this section. But first, a refresher on the in membership operator is shown below.
>>> num = 5
>>> num in (10, 21, 33)
False
>>> num = 21
>>> num in (10, 21, 33)
True
The count() method returns the number of times a value is present in the tuple object.
>>> nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)
>>> nums.count(3)
2
>>> nums.count(31)
0
The index() method will give the index of the first occurrence of a value. It will raise ValueError if the value isn't present, which you can avoid by using the in operator first. Or, you can use the try-except statement to handle the exception as needed.
>>> nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)
>>> nums.index(3)
4
>>> n = 31
>>> nums.index(n)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: tuple.index(x): x not in tuple
>>> if n in nums:
...     print(nums.index(n))
... else:
...     print(f'{n} not present in "nums" tuple')
...
31 not present in "nums" tuple
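The try-except alternative mentioned above could look something like the following sketch. The `find_index()` helper and its `default` argument are hypothetical names chosen for this illustration, and returning None for a missing value is just one possible design.

```python
def find_index(seq, value, default=None):
    # hypothetical helper: return 'default' instead of raising ValueError
    try:
        return seq.index(value)
    except ValueError:
        return default

nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)

print(find_index(nums, 3))    # 4
print(find_index(nums, 31))   # None
```

Compared to checking with the in operator first, this approach searches the tuple only once. None is used as the fallback here because a number like -1 is itself a valid index.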
The list and str sequence types have many more methods and they will be discussed separately in later chapters.
Specialized container datatypes
- docs.python: collections (alternatives to Python's general purpose built-in containers dict, list, set, and tuple)
- docs.python: array (compactly represent an array of basic values: characters, integers, floating point numbers)
- boltons (pure-Python utilities which extend the Python standard library)