Set

set is a mutable, unordered collection of hashable objects. frozenset is similar to set, but immutable. See docs.python: set, frozenset for documentation.

Initialization

Sets are declared as a collection of objects separated by commas within {} characters. An empty {} creates a dict, so use the set() function to initialize an empty set. The set() function can also convert iterables to sets.

>>> empty_set = set()
>>> empty_set
set()

>>> nums = {-0.1, 3, 2, -5, 7, 1, 6.3, 5}
# note that the order is not the same as declaration
>>> nums
{-0.1, 1, 2, 3, 5, 6.3, 7, -5}

# sets can only contain distinct elements
>>> set([3, 2, 11, 3, 5, 13, 2])
{2, 3, 5, 11, 13}
>>> set('initialize')
{'a', 'n', 't', 'l', 'e', 'i', 'z'}
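
As a small sketch of the points above, the following compares {} with set() and passes a couple of other iterables to set(). The variable names are only illustrative, and sorted() is used so that the printed order is predictable.

```python
# {} creates an empty dict, not an empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable, duplicates are dropped
from_range = set(range(5))
from_tuple = set((1, 1, 2))
print(sorted(from_range))  # [0, 1, 2, 3, 4]
print(sorted(from_tuple))  # [1, 2]
```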

set doesn't allow unhashable objects as elements. This rules out mutable built-in containers like list, dict and set.

>>> {1, 3, [1, 2], 4}
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    {1, 3, [1, 2], 4}
TypeError: unhashable type: 'list'

>>> {1, 3, (1, 2), 4}
{3, 1, (1, 2), 4}
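
The hashability rule can be explored a bit further. The sketch below assumes you want a set as an element of another set, which is where frozenset helps. It also shows that a tuple is accepted only if everything inside it is hashable. The exact wording of the error message differs between Python versions, so it is printed rather than shown here.

```python
# a frozenset is hashable, so it can be a set element (or a dict key)
inner = frozenset([1, 2])
nested = {1, 3, inner, 4}
print(inner in nested)  # True

# a regular set is mutable and therefore unhashable
try:
    {1, 3, {1, 2}, 4}
except TypeError as e:
    set_error = str(e)
    print(set_error)

# a tuple is hashable only if all of its items are hashable
try:
    {(1, [2, 3])}
except TypeError as e:
    tuple_error = str(e)
    print(tuple_error)
```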

Set methods and operations

The in operator checks if a value is present in the given set. Since set uses a hashtable (similar to dict keys), the lookup time is constant on average. For large containers, this is much faster than sequences like list or tuple, which have to be searched item by item.

>>> colors = {'red', 'blue', 'green'}
>>> 'blue' in colors
True
>>> 'orange' in colors
False
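
To get a rough feel for the difference in lookup time, you can time the in operator on a list and a set holding the same values. This is only a sketch: the container size and repeat count are arbitrary choices, and the actual timings depend on your machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # not present, so the list has to be scanned fully

# total time for 100 membership checks on each container
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)
print(f'list: {list_time:.5f}s, set: {set_time:.5f}s')
```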

Here are some examples for set operations like union, intersection, etc. You can use either methods or operators. Both return a new set object instead of modifying the set in-place. The difference is that the set methods can accept any iterable, whereas the operators work only with set or set-like objects.

>>> color_1 = {'teal', 'light blue', 'green', 'yellow'}
>>> color_2 = {'light blue', 'black', 'dark green', 'yellow'}

# union of two sets: color_1 | color_2
>>> color_1.union(color_2)
{'light blue', 'green', 'dark green', 'black', 'teal', 'yellow'}

# common items: color_1 & color_2
>>> color_1.intersection(color_2)
{'light blue', 'yellow'}

# items from color_1 not present in color_2: color_1 - color_2
>>> color_1.difference(color_2)
{'teal', 'green'}
# items from color_2 not present in color_1: color_2 - color_1
>>> color_2.difference(color_1)
{'dark green', 'black'}

# items present in one of the sets, but not both
# i.e. union of previous two operations: color_1 ^ color_2
>>> color_1.symmetric_difference(color_2)
{'green', 'dark green', 'black', 'teal'}
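
The difference between methods and operators shows up when the other operand is not a set. Here is a minimal sketch using a list named extra (an illustrative name). The method accepts it, while the operator raises TypeError.

```python
color_1 = {'teal', 'light blue', 'green', 'yellow'}
extra = ['black', 'yellow']  # a list, not a set

# the method accepts any iterable
merged = color_1.union(extra)
print(sorted(merged))  # ['black', 'green', 'light blue', 'teal', 'yellow']

# the operator needs set or set-like operands
try:
    color_1 | extra
except TypeError as e:
    operator_failed = True
    print('TypeError:', e)
else:
    operator_failed = False

# methods can also take more than one iterable
common = color_1.intersection(extra, ('yellow', 'red'))
print(common)  # {'yellow'}
```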

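As mentioned in the Dict chapter, the keys() and items() methods return set-like objects (items() is set-like only when all the values are hashable). You can apply set operators on them. The object returned by values() is not set-like, since values need not be unique or hashable.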

>>> marks_1 = dict(Rahul=86, Ravi=92, Rohit=75)
>>> marks_2 = dict(Jo=89, Rohit=78, Joe=75, Ravi=100)

>>> marks_1.keys() & marks_2.keys()
{'Ravi', 'Rohit'}
>>> marks_1.keys() - marks_2.keys()
{'Rahul'}
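
Here is a short sketch of a few more operations on these views, reusing the same two dicts. The names all_names, same_entries and common_marks are only illustrative. Note that the values() view has to be converted with set() before set operators can be applied.

```python
marks_1 = dict(Rahul=86, Ravi=92, Rohit=75)
marks_2 = dict(Jo=89, Rohit=78, Joe=75, Ravi=100)

# names present in either dict
all_names = marks_1.keys() | marks_2.keys()
print(sorted(all_names))  # ['Jo', 'Joe', 'Rahul', 'Ravi', 'Rohit']

# items() compares (key, value) pairs, none are identical here
same_entries = marks_1.items() & marks_2.items()
print(same_entries)  # set()

# values() doesn't support set operators, convert explicitly
try:
    common_marks = marks_1.values() & marks_2.values()
except TypeError:
    common_marks = set(marks_1.values()) & set(marks_2.values())
print(common_marks)  # {75}
```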

Methods like add(), update(), symmetric_difference_update(), intersection_update() and difference_update() will do the modifications in-place.

>>> color_1 = {'teal', 'light blue', 'green', 'yellow'}
>>> color_2 = {'light blue', 'black', 'dark green', 'yellow'}

# union
>>> color_1.update(color_2)
>>> color_1
{'light blue', 'green', 'dark green', 'black', 'teal', 'yellow'}

# adding a single value
>>> color_2.add('orange')
>>> color_2
{'black', 'yellow', 'dark green', 'light blue', 'orange'}
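
The other in-place methods follow the same pattern as update(). The sketch below works on copies (kept, removed and toggled are illustrative names) so that the three results can be compared against the same starting values.

```python
color_1 = {'teal', 'light blue', 'green', 'yellow'}
color_2 = {'light blue', 'black', 'dark green', 'yellow'}

# keep only the items also present in color_2
kept = set(color_1)
kept.intersection_update(color_2)
print(sorted(kept))  # ['light blue', 'yellow']

# delete the items that are present in color_2
removed = set(color_1)
removed.difference_update(color_2)
print(sorted(removed))  # ['green', 'teal']

# keep the items present in exactly one of the two sets
toggled = set(color_1)
toggled.symmetric_difference_update(color_2)
print(sorted(toggled))  # ['black', 'dark green', 'green', 'teal']

# in-place methods return None, so don't assign their result
result = kept.update(['orange'])
print(result)  # None
```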

The pop() method removes and returns an arbitrary element (you cannot choose which one). Use the remove() method if you want to delete an element based on its value. It raises KeyError if the element doesn't exist. The discard() method is similar to remove(), but it will not generate an error if the element doesn't exist. The clear() method will delete all the elements.

>>> colors = {'red', 'blue', 'green'}

>>> colors.pop()
'blue'
>>> colors
{'green', 'red'}

# you'll get KeyError if you use the 'remove()' method here
>>> colors.discard('black')

>>> colors.clear()
>>> colors
set()
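
The error cases mentioned above can be sketched as follows. Both remove() on a missing value and pop() on an empty set raise KeyError, which is caught here so that the script runs to completion.

```python
colors = {'red', 'blue', 'green'}

# discard() silently does nothing for a missing value
colors.discard('black')

# remove() raises KeyError for a missing value
try:
    colors.remove('black')
except KeyError as e:
    missing = e.args[0]
    print('not found:', missing)  # not found: black

# pop() removes an arbitrary element and returns it
removed = colors.pop()
print(removed in colors)  # False
print(len(colors))        # 2

# pop() on an empty set raises KeyError as well
try:
    set().pop()
    empty_pop_failed = False
except KeyError:
    empty_pop_failed = True
    print('cannot pop from an empty set')
```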

Here are some examples for comparison operations.

>>> names_1 = {'Ravi', 'Rohit'}
>>> names_2 = {'Ravi', 'Ram', 'Rohit', 'Raj'}

>>> names_1 == names_2
False

# same as: names_1 <= names_2
>>> names_1.issubset(names_2)
True

# same as: names_2 >= names_1
>>> names_2.issuperset(names_1)
True

# disjoint checks if there are no common elements
# same as: not names_1 & names_2
>>> names_1.isdisjoint(names_2)
False
>>> names_1.isdisjoint({'Jo', 'Joe'})
True
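
The operator forms of these comparisons are sketched below, along with the strict versions < and >, which are True only when the two sets are not equal. As with the other set methods, issubset() accepts any iterable, whereas the operators expect sets on both sides.

```python
names_1 = {'Ravi', 'Rohit'}
names_2 = {'Ravi', 'Ram', 'Rohit', 'Raj'}

print(names_1 <= names_2)  # True
print(names_1 < names_2)   # True, names_1 is a proper subset

# a set is a subset of itself, but not a proper subset
print(names_1 <= names_1)  # True
print(names_1 < names_1)   # False

# the method form accepts any iterable
is_sub = names_1.issubset(['Ravi', 'Rohit', 'Ram'])
print(is_sub)  # True

# order of declaration doesn't matter for equality
print({'Rohit', 'Ravi'} == names_1)  # True
```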

Exercises

  • Write a function that checks whether an iterable has duplicate values or not.

    >>> has_duplicates('pip')
    True
    >>> has_duplicates((3, 2))
    False
    
  • What does the above function return for has_duplicates([3, 2, 3.0])?