Importing and creating modules

The previous chapters focused on data types, functions (both built-in and user defined) and control structures. This chapter will show how to use built-in as well as user defined modules. Quoting from docs.python: Modules:

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended.

Random numbers

Say you want to generate a random number from a given range for a guessing game. You could write your own random number generator. Or, you could save development/testing time, and make use of the random built-in module.

Here's an example guessing game.

# rand_number.py
import random

# gives back an integer between 0 and 10 (inclusive)
rand_int = random.randrange(11)

print('I have thought of a number between 0 and 10.')
print('Can you guess it within 4 attempts?\n')
for _ in range(4):
    guess = int(input('Enter your guess: '))
    if guess == rand_int:
        print('Wow, you guessed it right!')
        break
    elif guess > rand_int:
        print('Oops, your guess is too high.')
    else:
        print('Oops, your guess is too low.')
else:
    print('\nOh no! You are out of chances. Better luck next time.')

import random loads this built-in module for use in the script; you'll see more details about import later in this chapter. The randrange() function follows the same start/stop/step logic as the range() function and returns a random integer from the given range. The for loop is used here to get the user input for a maximum of 4 attempts. The loop body doesn't need to know the current iteration count, and in such cases _ is conventionally used as a throwaway variable name.
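As a small sketch of the start/stop/step behavior described above, the three calls below mirror the one, two and three argument forms of range(). The variable names a, b and c are only for illustration, and the actual values differ from run to run.

```python
import random

# one argument: stop only, so the result is one of 0, 1, ..., 10
a = random.randrange(11)

# two arguments: start and stop, so the result is one of 3, 4, ..., 9
b = random.randrange(3, 10)

# three arguments: start, stop and step, so the result is one of 3, 5, 7, 9
c = random.randrange(3, 10, 2)

# each value always belongs to the equivalent range() object
print(a in range(11), b in range(3, 10), c in range(3, 10, 2))
```

Checking membership against the matching range() call is a quick way to confirm which values a particular randrange() call can produce.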

As mentioned in the previous chapter, the else clause is supported by loops too. It runs only if the loop completes normally. If the user correctly guesses the random number, break will be executed, which is not a normal loop completion. In that case, the else clause will not be executed.
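Here's a minimal sketch of the same loop/else behavior without user input, so that both outcomes can be seen in one run. The function name first_negative and the sample lists are made up for this illustration.

```python
def first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = f'found {n}'
            # leaving the loop via break skips the else clause
            break
    else:
        # reached only when the loop finished without a break
        result = 'no negative numbers'
    return result

print(first_negative([3, -1, 4]))
print(first_negative([3, 1, 4]))
```

The first call stops at -1 and skips the else clause, while the second call goes through every element and then runs the else clause.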

A sample run with correct guess is shown below.

$ python3.9 rand_number.py
I have thought of a number between 0 and 10.
Can you guess it within 4 attempts?

Enter your guess: 5
Oops, your guess is too low.
Enter your guess: 8
Oops, your guess is too high.
Enter your guess: 6
Wow, you guessed it right!

Here's a failed guess.

$ python3.9 rand_number.py
I have thought of a number between 0 and 10.
Can you guess it within 4 attempts?

Enter your guess: 1
Oops, your guess is too low.
Enter your guess: 2
Oops, your guess is too low.
Enter your guess: 3
Oops, your guess is too low.
Enter your guess: 4
Oops, your guess is too low.

Oh no! You are out of chances. Better luck next time.

Importing your own module

All the programs presented so far can be used as modules as is, without any further changes. However, doing so leads to some unwanted behavior. This section will discuss these issues and the next section will show how to resolve them.

# num_funcs.py
def sqr(n):
    return n * n

def fact(n):
    total = 1
    for i in range(2, n+1):
        total *= i
    return total

num = 5
print(f'square of {num} is {sqr(num)}')
print(f'factorial of {num} is {fact(num)}')

The above program defines two functions, one variable and calls the print() function twice. After you've written this program, open an interactive shell from the same directory. Then, load the module using import num_funcs where num_funcs is the name of the program without the .py extension.

>>> import num_funcs
square of 5 is 25
factorial of 5 is 120

So what happened here? Not only did the sqr and fact functions get imported, the code outside of these functions got executed as well. That isn't what you'd expect on loading a module. The next section will show how to prevent this behavior. For now, continue the REPL session.

>>> num_funcs.sqr(12)
144
>>> num_funcs.fact(0)
1
>>> num_funcs.num
5

As an exercise,

  • add docstrings to the above program and check the output of the help() function using num_funcs, num_funcs.fact, etc. as arguments.
  • check the output of num_funcs.fact() for negative integers and floating-point numbers. Then import the math built-in module and repeat the process with math.factorial(). Go through the Exception handling chapter and modify the above program to gracefully handle negative integers and floating-point numbers.

How does Python know where a module is located? Quoting from docs.python: The Module Search Path:

When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:

• The directory containing the input script (or the current directory when no file is specified).

• PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).

• The installation-dependent default.
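As a rough way to see the search path on your own machine, the sketch below prints each entry of sys.path in search order. The entries depend on the installation and on how the script is started, so no particular output is assumed here.

```python
import sys

# built-in modules are checked first; 'sys' itself is one of them
print('sys' in sys.builtin_module_names)

# directories (and possibly zip files) searched next, in this order
for location in sys.path:
    print(repr(location))
```

When started from the REPL, the first entry is typically an empty string, which stands for the current directory.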

__name__ special variable

The special variable __name__ will be assigned the string value '__main__' only for the program file that is executed. The files that are imported inside another program will see their own filename (without the extension) as the value for the __name__ variable. This behavior allows you to code a program that can act as a module as well as execute extra code if and only if it is run as the main program.

Here's an example to illustrate this behavior.

# num_funcs_module.py
def sqr(n):
    return n * n

def fact(n):
    total = 1
    for i in range(2, n+1):
        total *= i
    return total

if __name__ == '__main__':
    num = 5
    print(f'square of {num} is {sqr(num)}')
    print(f'factorial of {num} is {fact(num)}')

When you run the above program as a standalone application, the if condition will get evaluated to True.

$ python3.9 num_funcs_module.py
square of 5 is 25
factorial of 5 is 120

On importing, the above if condition will evaluate to False as num_funcs_module.py is no longer the main program. In the below example, the REPL session is the main program.

>>> __name__
'__main__'
>>> import num_funcs_module
>>> num_funcs_module.sqr(12)
144
>>> num_funcs_module.fact(0)
1

# 'num' variable inside the 'if' block is no longer accessible
>>> num_funcs_module.num
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'num_funcs_module' has no attribute 'num'

In the above example, there are three statements that'll be executed if the program is run as the main program. It is common to put such statements under a main() user defined function and then call it inside the if block.
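A minimal sketch of that main() convention is shown below, reusing the same two functions. The file name in the comment is hypothetical and only used for this illustration.

```python
# num_funcs_main.py (hypothetical file name)
def sqr(n):
    return n * n

def fact(n):
    total = 1
    for i in range(2, n+1):
        total *= i
    return total

def main():
    # statements meant only for standalone execution live here
    num = 5
    print(f'square of {num} is {sqr(num)}')
    print(f'factorial of {num} is {fact(num)}')

if __name__ == '__main__':
    main()
```

With this layout, num is a local variable of main(), and a program that imports the module can still call main() explicitly if it wants the same output.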

There are many such special variables and methods with double underscores around their names. They are also called dunder variables and methods. See stackoverflow: __name__ special variable for a detailed discussion and strange use cases.

Different ways of importing

When you use the import <module> statement, you have to prefix every name with the module name whenever you use the module's features. If this becomes cumbersome, you can use alternate ways of importing.

First up, removing the prefix altogether as shown below. This will load all names from the module except those beginning with a _ character. Use this feature only if needed, since one of the other alternatives might suit better.

>>> from math import *
>>> sin(radians(90))
1.0
>>> pi
3.141592653589793
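To get a feel for how many names a * import brings in, the sketch below collects the names that from math import * would load, without actually performing the import. It follows the rule stated above, plus one detail not covered in this chapter: if a module defines a list named __all__, that list decides what * imports instead. The variable name star_names is made up for this illustration.

```python
import math

# a module-defined __all__ list takes priority when present
star_names = getattr(math, '__all__', None)
if star_names is None:
    # otherwise, every name not starting with an underscore is loaded
    star_names = [name for name in dir(math) if not name.startswith('_')]

print(len(star_names))
print('sin' in star_names, 'pi' in star_names)
print('__name__' in star_names)
```

The count printed first varies across Python versions. Every one of those names can silently replace a variable or function of the same name in your own program, which is one reason to prefer importing only the names you need.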

Instead of using *, a comma separated list of names is usually enough.

>>> from random import randrange
>>> randrange(3, 10, 2)
9

>>> from math import cos, pi
>>> cos(pi)
-1.0

You can also alias the name being imported using the as keyword. You can specify multiple aliases with comma separation.

>>> import random as rd
>>> rd.randrange(4)
1

>>> from math import factorial as fact
>>> fact(10)
3628800
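The examples above alias one name at a time. As a short sketch of the comma separated form, several names can be given their own aliases in a single statement. The alias names used here are arbitrary choices for this illustration.

```python
from math import factorial as fact, sqrt as root, pi as PI

print(fact(5))
print(root(16))
print(PI)
```

This prints 120, 4.0 and 3.141592653589793 respectively. Only the aliases are available after such an import, so factorial, sqrt and pi themselves are not defined unless imported separately.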

__pycache__ directory

If you notice the __pycache__ directory after you import your own module, don't panic. Quoting from docs.python: Compiled Python files:

To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from different releases and different versions of Python to coexist.

You can use python3.9 -B if you do not wish the __pycache__ directory to be created.
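If you are curious about where the cached file for a given source file would go, the importlib.util.cache_from_source() function reports the path without creating anything. The sketch below uses the spam.py name from the quote, which doesn't need to exist. It also prints sys.dont_write_bytecode, which is True when Python was started with the -B option.

```python
import importlib.util
import sys

# path where the compiled version of spam.py would be cached
cache_path = importlib.util.cache_from_source('spam.py')
print(cache_path)

# True if bytecode caching is turned off for this session
print(sys.dont_write_bytecode)
```

On a typical CPython 3.9 installation on Linux, the first line is expected to be __pycache__/spam.cpython-39.pyc, but the exact name depends on the interpreter, its version and the operating system.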

Explore modules