Tuple and Sequence operations

This chapter will discuss the tuple data type and some of the common sequence operations. Data types like str, range, list and tuple fall under Sequence types. Binary Sequence Types aren't discussed in this book.

Sequences and iterables

Quoting from docs.python glossary: sequence:

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.

Partial quote from docs.python glossary: iterable:

An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects...

Some of the operations behave differently or do not apply to certain types. See docs.python: Common Sequence Operations for details.
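One way to see the difference between the two glossary entries is to check objects against the abstract base classes in the collections.abc module. The snippet below is only a sketch; classify() is a made-up helper for this illustration and isn't part of the standard library.

```python
from collections.abc import Iterable, Sequence

def classify(obj):
    """Return 'sequence', 'iterable' or 'neither' for the given object."""
    # every Sequence is also an Iterable, so Sequence is checked first
    if isinstance(obj, Sequence):
        return 'sequence'
    if isinstance(obj, Iterable):
        return 'iterable'
    return 'neither'

print(classify((1, 2, 3)))    # sequence
print(classify('hello'))      # sequence
print(classify(range(5)))     # sequence
print(classify({'a': 1}))     # iterable
print(classify({1, 2}))       # iterable
print(classify(42))           # neither
```

Note that dict and set are reported as iterables but not sequences, which matches the glossary descriptions quoted above.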

Initialization

Tuples are declared as a collection of zero or more objects, separated by commas and enclosed within () parentheses. Each element can be specified as a value by itself or as an expression. The outer parentheses are optional if comma separation is present. Here are some examples:

# can also use: empty_tuple = tuple()
>>> empty_tuple = ()

# note the trailing comma, without it the result would be a 'str' value
# parentheses are optional here, so this is the same as: one_element = 'apple',
>>> one_element = ('apple',)

# multiple elements example
>>> dishes = ('Aloo tikki', 'Baati', 'Khichdi', 'Makki roti', 'Poha')

# mixed data type example, uses expressions as well
>>> mixed = (1+2, 'two', (-3, -4), empty_tuple)
>>> mixed
(3, 'two', (-3, -4), ())

You can use the tuple() built-in function to create a tuple from an iterable (described in the previous section).

>>> chars = tuple('hello')
>>> chars
('h', 'e', 'l', 'l', 'o')
>>> tuple(range(3, 10, 3))
(3, 6, 9)

info Tuples are immutable, but individual elements can be either mutable or immutable. As an exercise, given chars = tuple('hello'), check the output of the expression chars[0] and the statement chars[0] = 'H'.
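To illustrate the "mutable element" part of the above note, here's a small sketch using a tuple that holds a list. The names fruits and record are made up for this example.

```python
fruits = ['apple', 'banana']
record = ('fruits', fruits)

# the tuple still refers to the same two objects,
# but the list object itself can be changed in place
fruits.append('cherry')
print(record)               # ('fruits', ['apple', 'banana', 'cherry'])

# the second element is the very same list object
print(record[1] is fruits)  # True
```

In other words, immutability applies to which objects the tuple refers to, not to the contents of those objects.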

Slicing

One or more elements can be retrieved from a sequence using the slicing notation (this wouldn't work for an iterable like dict or set). It works similarly to the start/stop/step logic seen with the range() function. The default step is 1. The default values for start and stop depend on whether the step is positive or negative.

>>> primes = (2, 3, 5, 7, 11)

# index starts with 0
>>> primes[0]
2

# start=2 and stop=4, default step=1
# note that the element at index 4 (stop value) isn't part of the output
>>> primes[2:4]
(5, 7)
# default start=0
>>> primes[:3]
(2, 3, 5)
# default stop=len(seq) for positive values of step
>>> primes[3:]
(7, 11)

# shallow copy of the sequence, same as primes[::1]
>>> primes[:]
(2, 3, 5, 7, 11)
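The same notation works for the other sequence types mentioned at the start of this chapter. Here's a short sketch (the variable names are arbitrary) showing that the result of a slice has the same type as the sequence it was taken from:

```python
word = 'sequence'
nums = [10, 20, 30, 40, 50]
r = range(10)

print(word[2:5])        # que
print(nums[:2])         # [10, 20]

# slicing a range gives another range object
print(r[3:8])           # range(3, 8)
print(list(r[::3]))     # [0, 3, 6, 9]
```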

You can use a negative index to get elements from the end of the sequence. This is especially helpful when you don't know the size of the sequence. Given an integer n in the range 1 to len(seq), the expression seq[-n] is evaluated as seq[len(seq) - n].

>>> primes = (2, 3, 5, 7, 11)

# len(primes) - 1 = 4, so this is same as primes[4]
>>> primes[-1]
11

# seq[-n:] will give the last n elements
>>> primes[-1:]
(11,)
>>> primes[-2:]
(7, 11)
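If you'd like to convince yourself of the seq[-n] rule, the following sketch compares both forms for every valid value of n. The variable size is introduced here only for readability.

```python
primes = (2, 3, 5, 7, 11)
size = len(primes)

for n in range(1, size + 1):
    # both expressions refer to the same element
    print(f'primes[-{n}] and primes[{size - n}]: {primes[-n]}, {primes[size - n]}')

# the same idea applies to slice boundaries
print(primes[-3:-1])                # (5, 7)
print(primes[size - 3:size - 1])    # (5, 7)
```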

Here are some examples with different step values.

>>> primes = (2, 3, 5, 7, 11)

# same as primes[0:5:2]
>>> primes[::2]
(2, 5, 11)

# retrieve elements in reverse direction
# note that the element at index 1 (stop value) isn't part of the output
>>> primes[3:1:-1]
(7, 5)
# reversed sequence
# would help you with the palindrome exercise from the Control structures chapter
>>> primes[::-1]
(11, 7, 5, 3, 2)

As an exercise, given primes = (2, 3, 5, 7, 11),

  • what happens if you use primes[5] or primes[-6]?
  • what happens if you use primes[:5] or primes[-6:]?
  • is it possible to get the same output as primes[::-1] by using an explicit number for the stop value? If not, why not?

Sequence unpacking

You can assign the individual elements of an iterable to multiple variables. This is known as sequence unpacking and it is handy in many situations.

>>> details = ('2018-10-25', 'car', 2346)
>>> purchase_date, vehicle, qty = details
>>> purchase_date
'2018-10-25'
>>> vehicle
'car'
>>> qty
2346

Here's how you can easily swap variable values.

>>> num1 = 3.14
>>> num2 = 42
>>> num3 = -100

# RHS is a single tuple object (recall that parentheses are optional)
>>> num1, num2, num3 = num3, num1, num2
>>> print(f'{num1 = }; {num2 = }; {num3 = }')
num1 = -100; num2 = 3.14; num3 = 42

Unpacking isn't limited to single value assignments. You can use a * prefix to assign all the remaining values, if any are left, to a list variable.

>>> values = ('first', 6.2, -3, 500, 'last')

>>> x, *y = values
>>> x
'first'
>>> y
[6.2, -3, 500, 'last']

>>> a, *b, c = values
>>> a
'first'
>>> b
[6.2, -3, 500]
>>> c
'last'
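Unpacking also works in a for loop header, with nested structures and with iterables other than tuples. The following sketch uses made-up data to show a few such cases:

```python
# hypothetical purchase records, in the same shape as 'details' above
purchases = (('2018-10-25', 'car', 2346), ('2019-01-03', 'bike', 120))

# each inner tuple is unpacked into three variables per iteration
for purchase_date, vehicle, qty in purchases:
    print(f'{purchase_date}: {qty} x {vehicle}')

# nested unpacking: the target mirrors the shape of the value
label, (x, y) = ('point', (3, -4))
print(label, x, y)      # point 3 -4

# any iterable can be unpacked, the * target always collects into a list
first, *rest = 'hello'
print(first, rest)      # h ['e', 'l', 'l', 'o']
```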

As an exercise, what do you think will happen for these cases, given nums = (1, 2):

  • a, b, c = nums
  • a, *b, c = nums
  • *a, *b = nums

Returning multiple values

Tuples are also the preferred way to return multiple values from a function. Here are some examples:

>>> def min_max(iterable):
...     return min(iterable), max(iterable)
... 
>>> min_max('visualization')
('a', 'z')
>>> small, big = min_max((10, -42, 53.2, -3))
>>> small
-42
>>> big
53.2

The min_max(iterable) user-defined function in the above snippet returns both the minimum and maximum values of a given iterable input. min() and max() are built-in functions. You can either save the output as a tuple or unpack it into multiple variables. You'll see built-in functions that return a tuple as output later in this chapter.

warning The use of both min() and max() in the above example is for illustration purposes only. As an exercise, write custom logic that iterates only once over the input sequence and calculates both the minimum and maximum values.
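The divmod() built-in function is another example of returning multiple values as a tuple: it gives the quotient and the remainder of a division. A quick sketch, with arbitrary numbers chosen for illustration:

```python
# the result can be saved as a single tuple
result = divmod(17, 5)
print(result)               # (3, 2)

# or unpacked into separate variables
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2

# for example, converting 200 seconds to minutes and seconds
minutes, seconds = divmod(200, 60)
print(f'{minutes} min {seconds} sec')   # 3 min 20 sec
```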

Iteration

You have already seen examples of a for loop iterating over a sequence data type. Here's a refresher:

>>> nums = (3, 6, 9)
>>> for n in nums:
...     print(f'square of {n} is {n ** 2}')
... 
square of 3 is 9
square of 6 is 36
square of 9 is 81

In the above example, you get one element per iteration. If you need the index of the elements as well, you can use the enumerate() built-in function. You'll get a tuple per iteration, containing the index (starting with 0 by default) and the value at that index. Here are some examples:

>>> nums = (42, 3.14, -2)
>>> for t in enumerate(nums):
...     print(t)
... 
(0, 42)
(1, 3.14)
(2, -2)
>>> for idx, val in enumerate(nums):
...     print(f'{idx}: {val:>5}')
... 
0:    42
1:  3.14
2:    -2

info The enumerate() built-in function accepts a start argument, which defaults to 0. As an exercise, change the above snippet to start the index from 1 instead of 0.

Arbitrary number of arguments

As seen before, the print() function can accept zero or more values separated by a comma. Here's a portion of the documentation as a refresher:

print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

You can write your own functions to accept an arbitrary number of arguments as well. The packing syntax is similar to the sequence unpacking examples seen earlier in the chapter. A * prefix to an argument name will allow it to accept zero or more values. Such an argument will be packed as a tuple data type and it should always be specified after positional arguments (if any). Idiomatically, args is used as the variable name. Here's an example:

>>> def many(a, *args):
...     print(f'{a = }; {args = }')
... 
>>> many()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: many() missing 1 required positional argument: 'a'
>>> many(1)
a = 1; args = ()
>>> many(1, 'two', 3)
a = 1; args = ('two', 3)

Here's a more practical example:

>>> def sum_nums(*args):
...     total = 0
...     for n in args:
...         total += n
...     return total
... 
>>> sum_nums()
0
>>> sum_nums(3, -8)
-5
>>> sum_nums(1, 2, 3, 4, 5)
15
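The * prefix can also be used in the other direction, at the call site, to spread the elements of an iterable across separate arguments. Here's a small sketch; describe() is a made-up function for illustration, similar in shape to many() seen earlier.

```python
def describe(first, *rest):
    # 'rest' is always a tuple, possibly empty
    return f'{first = }; {rest = }'

values = (1, 2, 3)

# the whole tuple is passed as a single argument
print(describe(values))     # first = (1, 2, 3); rest = ()

# the elements are passed as three separate arguments
print(describe(*values))    # first = 1; rest = (2, 3)
```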

As an exercise,

  • add a default valued argument initial which should be used to initialize total instead of 0 in the above sum_nums() function. For example, sum_nums(3, -8) should give -5 and sum_nums(1, 2, 3, 4, 5, initial=5) should give 20.
  • what would happen if you call the above function like sum_nums(initial=5, 2)?
  • what would happen if you have nums = (1, 2) and call the above function like sum_nums(*nums, initial=3)?
  • in what ways does this function differ from the sum() built-in function?

info See also docs.python: Arbitrary Argument Lists.

info The Arbitrary keyword arguments section in a later chapter will discuss how to define functions that accept an arbitrary number of keyword arguments.

zip

Use zip() to iterate over two or more iterables simultaneously. On each iteration, you'll get a tuple with an item from each of the iterables. Iteration will stop when any of the iterables is exhausted. See itertools.zip_longest() and stackoverflow: Zipped Python generators with 2nd one being shorter for alternatives.

Here's an example:

>>> odd = (1, 3, 5)
>>> even = (2, 4, 6)
>>> for i, j in zip(odd, even):
...     print(i + j)
... 
3
7
11
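When the inputs differ in length, zip() silently drops the extra items, whereas itertools.zip_longest() pads the shorter input. The sketch below uses made-up tuples to compare the two:

```python
from itertools import zip_longest

letters = ('a', 'b', 'c')
nums = (1, 2)

# stops as soon as the shorter iterable is exhausted
print(list(zip(letters, nums)))
# [('a', 1), ('b', 2)]

# missing values are filled with None by default
print(list(zip_longest(letters, nums)))
# [('a', 1), ('b', 2), ('c', None)]

# the fillvalue argument changes the padding value
print(list(zip_longest(letters, nums, fillvalue=0)))
# [('a', 1), ('b', 2), ('c', 0)]
```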

As an exercise, write a function that returns the sum of the products of corresponding elements of two sequences. For example, the result should be 44 for (1, 3, 5) and (2, 4, 6).

Tuple methods

While this book won't discuss Object-Oriented Programming (OOP) in any detail, you'll still see plenty of examples that use objects and their methods. You've already seen a few such examples with modules. See Practical Python Programming and Fluent Python if you want to learn about Python OOP in depth. See also docs.python: Data model.

Data types in Python are all internally implemented as classes. You can use the dir() built-in function to get a list of valid attributes for an object.

# you can also use tuple objects such as 'odd' and 'even' declared earlier
>>> dir(tuple)
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__',
 '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
 '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__',
 '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__',
 '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__',
 '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']

>>> even = (2, 4, 6)
# same as: len(even)
>>> even.__len__()
3

The non-dunder names (last two items) in the above listing will be discussed in this section. But first, a refresher on the in membership operator is shown below.

>>> num = 5
>>> num in (10, 21, 33)
False

>>> num = 21
>>> num in (10, 21, 33)
True

The count() method returns the number of times a value is present in the tuple object.

>>> nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)
>>> nums.count(3)
2
>>> nums.count(31)
0

The index() method will give the index of the first occurrence of a value. It will raise ValueError if the value isn't present, which you can avoid by using the in operator first. Or, you can use the try-except statement to handle the exception as needed.

>>> nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)

>>> nums.index(3)
4

>>> n = 31
>>> nums.index(n)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: tuple.index(x): x not in tuple
>>> if n in nums:
...     print(nums.index(n))
... else:
...     print(f'{n} not present in "nums" tuple')
... 
31 not present in "nums" tuple
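The try-except alternative mentioned above could look something like this. The find_index() helper and its default parameter are made up for this sketch; the choice of None as the fallback value is just one possible convention.

```python
def find_index(seq, value, default=None):
    """Return the index of the first occurrence of value, or default."""
    try:
        return seq.index(value)
    except ValueError:
        # raised by index() when the value isn't present
        return default

nums = (1, 4, 6, 22, 3, 5, 2, 1, 51, 3, 1)

print(find_index(nums, 3))      # 4
print(find_index(nums, 31))     # None
```

Compared to checking with the in operator first, this approach searches the tuple only once when the value is present.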

info The list and str sequence types have many more methods and they will be discussed separately in later chapters.

Specialized container datatypes

  • docs.python: collections — alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple
  • docs.python: array — compactly represent an array of basic values: characters, integers, floating point numbers
  • boltons — pure-Python utilities which extend the Python standard library
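As a small taste of the collections module, here's a sketch using namedtuple, which creates tuple subclasses whose elements can also be accessed by name. The Purchase type and its field names are made up for this example, reusing the data from the sequence unpacking section.

```python
from collections import namedtuple

# define a tuple subclass with three named fields
Purchase = namedtuple('Purchase', ['date', 'vehicle', 'qty'])

p = Purchase('2018-10-25', 'car', 2346)

# access by name or by index
print(p.vehicle)    # car
print(p[2])         # 2346

# still a regular tuple, so unpacking works as usual
date, vehicle, qty = p
print(isinstance(p, tuple))     # True
```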