Introduction To Python: Part 2

In this lecture we’ll get to know more about some of the core constructs of Python programming:

  • Functions
  • Conditionals and Booleans
  • Simple Looping and Containers
  • Modules and Packages

Functions

Previously, you’ve been introduced to writing simple functions. You’ve learned about the def statement, about specifying parameters when defining a function, and about passing arguments when you call a function. And you’ve learned that in a simple function like the one below, the symbols x, y, and z are local names.

def fun(x, y):
    z = x + y
    return z

But what does local really mean in Python?

Local vs. Global

Symbols bound in Python have a scope. That scope determines where a symbol is visible, or what value it has in a given block. Consider this example code (try it out in your own interpreter):

In [14]: x = 32
In [15]: y = 33
In [16]: z = 34
In [17]: def fun(y, z):
   ....:     print(x, y, z)
   ....:
In [18]: fun(3, 4)
32 3 4

Notice that the value printed for x comes from outside the function, even though the symbol is used inside the function. This is a global name. Conversely, even though there are a y and z defined globally, the value used for them is local to the function. But did that change the value of y and z in the global scope?

In [19]: y
Out[19]: 33

In [20]: z
Out[20]: 34

Names in local scope mask names bound in the global scope. They are really different names in a different place. Binding different values to them does not change the binding of the name in the global scope.

In Python, you should use global bindings mostly for constants (values that are meant to be used everywhere and are not changed). It is conventional in Python to designate global constants by typing the symbols we bind to them in ALL_CAPS:

INSTALLED_APPS = [u'foo', u'bar', u'baz']
CONFIGURATION_KEY = u'some secret value'
...

Again, this is just a convention, but it’s a good one to follow. It helps you to keep straight what symbols are bound in the global scope.

There’s a trap in this interplay of global and local names. Take a look at this function definition:

In [21]: x = 3

In [22]: def f():
   ....:     y = x
   ....:     x = 5
   ....:     print(x)
   ....:     print(y)
   ....:

What is going to happen when we call f? The Zen of Python tells us “In the face of ambiguity, refuse the temptation to guess.” So try it out and see:

In [23]: f()
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-23-0ec059b9bfe1> in <module>()
----> 1 f()

<ipython-input-22-9225fa53a20a> in f()
      1 def f():
----> 2     y = x
      3     x = 5
      4     print(x)
      5     print(y)

UnboundLocalError: local variable 'x' referenced before assignment

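Because x is assigned somewhere in the body of f, Python treats x as a local name throughout the whole function, even on lines that come before the assignment. The local name masks the global x, so when y = x runs, the local x has no value bound to it yet. This causes the UnboundLocalError.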

This is another example of why it’s a good idea to keep your global names as ALL_CAPS. It makes it easier to avoid this type of mistake.
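
Here is a minimal sketch of one way to stay out of this trap. The names SCALE, scaled, and broken are made up for illustration. The idea is to only read the ALL_CAPS global inside a function, and to give anything you assign its own local name:

```python
SCALE = 3  # module-level constant, named in ALL_CAPS by convention


def scaled(value):
    """Return value multiplied by the global SCALE."""
    result = value * SCALE  # SCALE is only read here, so the global is used
    return result


def broken():
    """Raise UnboundLocalError, because SCALE is assigned later in the body."""
    copy = SCALE  # fails: the assignment below makes SCALE local to broken
    SCALE = 5
    return copy


print(scaled(2))

try:
    broken()
except UnboundLocalError as error:
    print("UnboundLocalError:", error)
```

Calling scaled works because it never binds SCALE. Calling broken fails in the same way f did above.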

Parameters

So far we’ve seen simple parameter lists:

def fun(x, y, z):
    print(x, y, z)

These types of parameters are called positional. When you call a function, you must provide arguments for all positional parameters in the order they are listed.

You can provide default values for parameters in a function definition. When parameters are given with default values, they become optional.

In [24]: def fun(x=1, y=2, z=3):
   ....:     print(x, y, z)
   ....:

In [25]: fun()
1 2 3

When you have optional parameters, you can still provide arguments to a function call positionally, but you have to start with the first one. You can also use the parameter name as a keyword to indicate which one you mean. This is called a keyword argument, to set it apart from a plain positional argument:

In [26]: fun(6)
6 2 3
In [27]: fun(6, 7)
6 7 3
In [28]: fun(6, 7, 8)
6 7 8

In [29]: fun(y=4, x=1)
1 4 3

Once you’ve provided a keyword argument to a function call, you can no longer provide any positional arguments:

In [30]: fun(x=5, 6)
  File "<ipython-input-30-4529e5befb95>", line 1
    fun(x=5, 6)
SyntaxError: non-keyword arg after keyword arg

You do not have to use only one style or the other when writing functions. You can use both positional and optional parameters. But any positional parameters must come before any optional parameters.

def mixed(a, b, c='maybe'):
    print(a, b, c)
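
To see how such a function can be called, here is a small sketch. It uses a variant of mixed that returns its arguments as a tuple instead of printing them (that change is ours, made so the results are easy to inspect):

```python
def mixed(a, b, c='maybe'):
    """Return the three arguments as a tuple."""
    return (a, b, c)


print(mixed(1, 2))           # c falls back to its default
print(mixed(1, 2, 'yes'))    # c given positionally
print(mixed(1, 2, c='no'))   # c given by keyword
print(mixed(b=2, a=1))       # required parameters can be passed by keyword too

try:
    mixed(1)                 # b has no default, so this call fails
except TypeError as error:
    print("TypeError:", error)
```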

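This brings us to a fun feature of Python function definitions. You can define a parameter list that accepts an arbitrary number of positional or keyword arguments. The key is the * (splat) or ** (double-splat) prefix on a parameter name. Extra positional arguments are collected into a tuple, and extra keyword arguments into a dictionary: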

In [31]: def fun(*args, **kwargs):
   ....:     print(args, kwargs)
   ....:
In [32]: fun(1)
(1,) {}
In [33]: fun(1, 2, zombies=u"brains")
(1, 2) {'zombies': u'brains'}
In [34]: fun(1, 2, 3, zombies=u"brains", vampires=u"blood")
(1, 2, 3) {'vampires': u'blood', 'zombies': u'brains'}

By convention, use args and kwargs for this style of parameters.

Documentation

It’s often helpful to leave information in your code about what you were thinking when you wrote it. This can help reduce the number of WTFs per minute in reading it later. In Python, we have two approaches to this, comments and docstrings.

Comments

Comments go inline in the body of your code, to explain reasoning:

if (frobnaglers > whozits):
    # borangas are shermed to ensure frobnagler population
    # does not grow out of control
    sherm_the_boranga()

You can use them to mark places you want to revisit later:

for partygoer in partygoers:
    for balloon in balloons:
        for cupcake in cupcakes:
            # TODO: Reduce time complexity here.  It's killing us
            #  for large parties.
            resolve_party_favor(partygoer, balloon, cupcake)

Be judicious in your use of comments. Use them only when you need to. And make sure that the comments you leave are useful. This is not useful:

for sponge in sponges:
    # apply soap to each sponge
    worker.apply_soap(sponge)

Remember also that every comment you add is as much a maintenance burden as a line of code. Comments that are out-of-date are misleading at best, and dangerous at worst. You have to update them as your code changes to prevent them becoming hazards to your work.

Docstrings

In Python, docstrings are used to provide in-line documentation in a number of places.

The first place we will see is in the definition of functions. To define a function you use the def keyword. If a string literal is the first thing in the function block following the header, it is a docstring:

def complex_function(arg1, arg2, kwarg1=u'banana'):
    """Return a value resulting from a complex calculation."""
    # code block here

You can then read this in an interpreter as the __doc__ attribute of the function object. It will also be used by the interpreter help system.

In [2]: complex_function.__doc__
Out[2]: 'Return a value resulting from a complex calculation.'
In [3]: complex_function?
Signature: complex_function(arg1, arg2, kwarg1='banana')
Docstring: Return a value resulting from a complex calculation.
File:      ~/projects/training/codefellows/existing_course_repos/python-dev-accelerator/<ipython-input-1-1def4182e947>
Type:      function

A docstring should be a complete sentence in the form of a command describing what the function does:

"""Return a list of values based on blah blah""" is a good docstring. """Returns a list of values based on blah blah""" is not.

A good docstring fits onto a single line. If more description is needed, make the first line a complete sentence and add more lines below for enhancement.

Docstrings should always be enclosed with triple-quotes. This allows you to expand them more easily in the future if required. You should always close the string on the same line if the docstring is only one line.

Python has a style guide for creating docstrings, PEP 257. You should read it and get familiar. Well-formed docstrings are good evidence of your commitment to your code.

But as with inline comments, please remember that docstrings are a maintenance burden. Always keep your own docstrings up to date as you make changes. And remember that contributing to documentation is a great way to help out an Open Source library.

Recursion

You’ve seen functions that call other functions. A function can also call itself. We call that recursion.

As with other functions, a call within a call establishes a call stack. With recursion, if you are not careful, this stack can get very deep. Python limits how deep the stack may grow. The limit is there to stop runaway recursion from overflowing the stack and crashing the interpreter.
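
As a rough sketch of both ideas, here is a small recursive function with a base case, next to one that has none. The function names are made up for this example, and RecursionError assumes Python 3.5 or later:

```python
import sys


def count_down(n):
    """Return a list counting down from n to 0."""
    if n <= 0:
        return [0]                     # base case: stop recursing
    return [n] + count_down(n - 1)     # recursive case: a smaller problem


def never_stops(depth=0):
    """Recurse with no base case, to show the limit being hit."""
    return never_stops(depth + 1)


print(count_down(3))
print(sys.getrecursionlimit())         # the current maximum recursion depth

try:
    never_stops()
except RecursionError as error:
    print("RecursionError:", error)
```

Every recursive function needs a base case that returns without calling itself. Without one, the only thing that stops it is the recursion limit.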

Recursion is especially useful for a particular set of problems. For example, take the case of the factorial function. In mathematics, the factorial of a positive integer is the result of multiplying that integer by every positive integer smaller than it, down to 1. A recursive function models this mathematical function nicely:

5! == 5 * 4 * 3 * 2 * 1

Try writing this function in Python yourself!

Conditionals and Booleans

Making decisions in programming is quite important. We call the language constructs that support decision making conditionals. Conditionals depend on boolean logic (logic based on True and False). Let’s learn more about how Python handles conditionals and booleans.

Conditionals

Python supports conditionals through the if statement. It looks an awful lot like if in other languages:

if <expression>:
    <do truthy things>

As in other languages, there is support for an else clause. This is executed when the <expression> is falsy:

if <expression>:
    <do truthy things>
else:
    <do falsy things>

Python also supports multiple test expressions through the use of the elif clause. You may have as many alternate tests as you wish. They are evaluated in order from the top to the bottom. The block of code contained under the first one that matches is executed and all other clauses are ignored:

if <expression1>:
    <do truthy things>
elif <expression2>:
    <do other truthy things>
else:
    <do falsy things>

Make certain you understand the difference between these two programs:

if a:
    print(u'a')
elif b:
    print(u'b')

if a:
    print(u'a')
if b:
    print(u'b')

Notice that the test expression can be any valid Python expression. Remember, evaluating an expression always results in a value. Since all Python values have a boolean value, any valid expression will work.

Also notice that the test expression does not need to be contained in parentheses. This is quite different from most other languages. Only use parentheses in test expressions if you are trying to defeat standard operator precedence.

Switch

Many languages (JavaScript among them) have a switch construct.

switch (expr) {
  case "Oranges":
    document.write("Oranges are $0.59 a pound.<br>");
    break;
  case "Apples":
    document.write("Apples are $0.32 a pound.<br>");
    break;
  case "Mangoes":
  case "Papayas":
    document.write("Mangoes and papayas are $2.79 a pound.<br>");
    break;
  default:
    document.write("Sorry, we are out of " + expr + ".<br>");
}

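This form is not present in Python. Instead, you are encouraged to use the if...elif...else conditional construction. Another option is to use a dictionary (more on what that means in our next lesson). Python 3.10 added a match statement, but it performs structural pattern matching and is a different tool from a C-style switch.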
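
Here is one possible translation of the JavaScript switch above, first as if...elif...else and then as a dictionary lookup. The function and variable names are our own, and the functions return the message instead of writing it to a page:

```python
def price_message(fruit):
    """Return the price message for fruit, mirroring the switch above."""
    if fruit == "Oranges":
        return "Oranges are $0.59 a pound."
    elif fruit == "Apples":
        return "Apples are $0.32 a pound."
    elif fruit in ("Mangoes", "Papayas"):   # two cases sharing one branch
        return "Mangoes and papayas are $2.79 a pound."
    else:                                   # plays the role of default
        return "Sorry, we are out of " + fruit + "."


# The dictionary alternative: look the message up, with a fallback.
MESSAGES = {
    "Oranges": "Oranges are $0.59 a pound.",
    "Apples": "Apples are $0.32 a pound.",
    "Mangoes": "Mangoes and papayas are $2.79 a pound.",
    "Papayas": "Mangoes and papayas are $2.79 a pound.",
}


def price_message_from_dict(fruit):
    """Return the same message as price_message, using a dictionary lookup."""
    return MESSAGES.get(fruit, "Sorry, we are out of " + fruit + ".")


print(price_message("Papayas"))
print(price_message_from_dict("Kiwis"))
```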

So we can make decisions using if, depending on whether the test expression is true or false. But what does it mean to be true or false in Python?

Booleans

In Python, there are two boolean objects: True and False. Each is an object literal, that is to say, simply writing them as-is evaluates to the object itself.

In the abstract sense, though, the concept of truthiness in Python comes down to the question of “Something or Nothing”. If a value is nothing then it is falsy, otherwise it is truthy.

In a more concrete sense, this is a list of all the things in Python that count as falsy:

  • the None object

  • the False boolean object

  • Nothing:

    • zero of any numeric type: 0, 0.0, 0j (and 0L in Python 2).
    • any empty sequence, for example, "", (), [].
    • any empty mapping, for example, {} .
    • instances of user-defined classes, if the class defines a __bool__() method (__nonzero__() in Python 2) or a __len__() method, when __bool__() returns False or __len__() returns zero.

You can read more in the Python docs.

Everything else is truthy.
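
You can check the list above for yourself with bool(). This is a quick sketch, and the Basket class is a made-up example of a user-defined class whose truthiness follows its length:

```python
class Basket:
    """A hypothetical container whose truthiness follows its length."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


falsy_values = [None, False, 0, 0.0, 0j, "", (), [], {}, Basket([])]
truthy_values = [True, 1, -1, 0.1, "0", " ", (0,), [0], {"a": 0}, Basket([1])]

print([bool(value) for value in falsy_values])    # every entry is False
print([bool(value) for value in truthy_values])   # every entry is True
```

Notice that "0", (0,) and [0] are all truthy. A container holding a falsy value is still "something".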

Any object in Python, when passed to the bool() type object, will evaluate to True or False. But you rarely need to use this feature yourself. When you use the if statement, it automatically reads the boolean value of its test expression. This means that these forms are redundant, and not Pythonic:

# bad
if xx is True:
    do_something()
# worse
if xx == True:
    do_something()
# truly terrible:
if bool(xx) == True:
    do_something()

Instead, you should use what Python gives you:

if xx:
    do_something()

Boolean Operators

Boolean operators allow us to combine and alter boolean values in a number of ways. Python has three boolean operators: and, or, and not. Both and and or are binary operators (they require an operand on the left and on the right of the keyword), and evaluate from left to right.

The and operator will return the first operand that evaluates to False, or the last operand if none are False:

In [35]: 0 and 456
Out[35]: 0

The or operator will return the first operand that evaluates to True, or the last operand if none are True:

In [36]: 0 or 456
Out[36]: 456

The not operator is a unary operator (it takes only one operand, on the right) and inverts the boolean value of its operand:

In [39]: not True
Out[39]: False

In [40]: not False
Out[40]: True

Shortcutting

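Because of the way these operators choose their return value, Python allows very concise (and readable) boolean expressions. Both and and or stop evaluating as soon as the result is known, so the right-hand operand is only evaluated when it is needed: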

x or y      if x is false, return y, else return x

x and y     if x is false, return x, else return y

not x       if x is false, return True, else return False
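
The following sketch makes the shortcutting visible. The helper noisy and the list calls are invented for this example. They record which operands actually get evaluated:

```python
calls = []


def noisy(label):
    """Record that label was evaluated, then return it."""
    calls.append(label)
    return label


# or stops at the first truthy operand, so noisy("right") is never called.
result_or = noisy("left") or noisy("right")

# and stops at the first falsy operand, so noisy("never") is never called.
result_and = 0 and noisy("never")

# A common use: fall back to a default when a value is empty.
name = "" or "anonymous"

print(calls)
print(result_or, result_and, name)
```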

Chaining

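In Python, you can chain these boolean operators. Operators of the same kind are evaluated from left to right, and the first value that defines the result is returned. When you mix them, not binds most tightly, then and, then or, so the last line below reads as (a and b) or (c and (not d)):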

a or b or c or d
a and b and c and d
a and b or c and not d
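
Here is a short sketch of those three chains with some sample values plugged in. The values are arbitrary, chosen only so that each line has a clear result:

```python
a, b, c, d = 0, 2, 3, 0

print(a or b or c or d)       # 2: the first truthy operand
print(a and b and c and d)    # 0: the first falsy operand

# not binds tightest, then and, then or, so these two are equivalent:
implicit = a and b or c and not d
explicit = (a and b) or (c and (not d))
print(implicit, explicit)     # True True
```

When a chain mixes and, or, and not, adding the parentheses yourself costs nothing and saves the reader from having to remember the precedence rules.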

Ternary Expressions

In most programming languages, this is a fairly common idiom:

if something:
    x = a_value
else:
    x = another_value

In other languages, this can be compressed with a “ternary operator”:

result = a > b ? x : y;

In Python, the same is accomplished with a conditional expression, often called a ternary expression:

y = 5 if x > 2 else 3
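
To make the equivalence concrete, here is a small sketch with two made-up functions that do the same job, one with the one-line expression and one with the full if/else statement:

```python
def size_label(x):
    """Return 'big' when x is greater than 2, otherwise 'small'."""
    return 'big' if x > 2 else 'small'


def size_label_long(x):
    """Do the same with a full if/else statement."""
    if x > 2:
        label = 'big'
    else:
        label = 'small'
    return label


print(size_label(5), size_label(1))
print(size_label_long(5), size_label_long(1))
```

Read the one-line form as "this value, if the test passes, else that value". The test sits in the middle.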

Boolean Return Values

Remember that Python objects themselves have boolean values. Remember too that boolean expressions will always return an object with a boolean value. Making use of this can lead to some very terse but readable (Pythonic) code.

Consider a function to calculate if you can sleep in (from an exercise at http://codingbat.com). You can sleep in if it is not a weekday or if you are on vacation. You could write this function like so:

def sleep_in(weekday, vacation):
    if weekday == True and vacation == False:
        return False
    else:
        return True

That’s a correct solution. But it’s not a particularly Pythonic way of solving the problem. Here’s a better solution:

def sleep_in(weekday, vacation):
    return not (weekday == True and vacation == False)

But remember that comparing to a boolean is never required in Python. Here’s an even better solution:

def sleep_in(weekday, vacation):
    return (not weekday) or vacation
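
If you are not convinced the three versions agree, you can check every combination of inputs. This is a sketch in which the first two versions have been renamed (sleep_in_verbose and sleep_in_negated are our names) so that all three can live side by side:

```python
from itertools import product


def sleep_in_verbose(weekday, vacation):
    if weekday == True and vacation == False:
        return False
    else:
        return True


def sleep_in_negated(weekday, vacation):
    return not (weekday == True and vacation == False)


def sleep_in(weekday, vacation):
    return (not weekday) or vacation


# Try every combination of the two boolean inputs.
rows = []
for weekday, vacation in product([True, False], repeat=2):
    answers = (
        sleep_in_verbose(weekday, vacation),
        sleep_in_negated(weekday, vacation),
        sleep_in(weekday, vacation),
    )
    rows.append((weekday, vacation, answers))
    print(weekday, vacation, answers)
```

Each printed row shows the same answer three times, so the short version is a safe replacement for the long one when the inputs are booleans.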

Note

Python Trivia: bool is a subclass of int, and True and False behave like the integers 1 and 0, so the following holds:

In [1]: True == 1
Out[1]: True
In [2]: False == 0
Out[2]: True

And you can even do math with them (though it’s a bit odd to do so):

In [6]: 3 + True
Out[6]: 4

Simple Looping and Containers

In order to do something interesting for homework, we are going to need to touch on looping and containers. We will visit them more in-depth in a later lesson. This is just a quick introduction.

Lists

A list is a container that stores values in order. It is pretty much like an “array” or “vector” in other languages. We can construct one using the list object literal, []:

a_list = [2, 3, 5, 9]
a_list_of_strings = [u'this', u'that', u'the', u'other']
one, two, three = [1, 2, 3]
newlist = [one, two, three]

You can place values directly into the list, or symbols. If you use symbols, the values to which they are bound are actually stored. This creates another reference to the value, in addition to the reference from the symbol.

Tuples

The tuple is another container type. It also stores values in order. We construct a tuple using the () object literal:

a_tuple = (2, 3, 4, 5)
a_tuple_of_strings = (u'this', u'that', u'the', u'other')
one, two, three = (1, 2, 3)
newtuple = (one, two, three)

As with lists, you can place values or symbols into a tuple. As with lists, placing a symbol stores its value and creates a new reference to that value.

However, tuples are not the same as lists. The exact difference is a topic for next session.

There are other container types, but these two will do for now.

For Loops

The for statement in Python defines a for loop. The for loop is also sometimes called a ‘determinate’ loop, because it will repeat a determined number of times. You use a for loop when you need to take some action on every item in a container.

In [10]: a_list = [2, 3, 4, 5]

In [11]: for item in a_list:
   ....:     print(item)
   ....:
2
3
4
5

As the loop repeats, each item from the container is bound, successively, to the loop variable. Notice that after the loop has finished, the loop variable is still in scope:

In [12]: item
Out[12]: 5

Range

The range builtin produces a sequence of numbers. In Python 2 it builds a list, as shown below. In Python 3 it returns a lazy range object instead (more on that in a later lesson). You can use it when you need to perform some operation a set number of times.

In [12]: range(6)
Out[12]: [0, 1, 2, 3, 4, 5]

In [13]: for i in range(6):
   ....:     print(u'spam', end=u' ')
   ....:
spam spam spam spam spam spam
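
If you are running Python 3, the first call above will not show a list. A quick sketch of what to expect there, and how to get the list if you want to look at it:

```python
numbers = range(6)

print(numbers)          # range(0, 6): a lazy range object, not a list
print(list(numbers))    # [0, 1, 2, 3, 4, 5]

# Looping works the same way in either version.
total = 0
for i in range(6):
    total += i
print(total)            # 15
```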

That will be enough to work with for the time being. Each of these has intricacies we will explore further in later lessons. For now, let’s turn to the issue of the larger organization of our code, and Modules and Packages.

Modules, Packages and Namespaces

In Python, the structure of your code is determined by whitespace. How you indent your code determines how it is structured. We say that Python is whitespace significant.

block statement:
    some code body
    some more code body
    another block statement:
        code body in
        that block

The colon that terminates a block statement is also important. You can put a one-liner after the colon:

In [167]: x = 12
In [168]: if x > 4: print(x)
12

But this should only be done if it makes your code more readable.

When indenting your code you could use any number of spaces or a tab, and Python 2 even tolerates a mixture of tabs and spaces (Python 3 rejects inconsistent mixing). However, if you want anyone to take you seriously as a Python developer, always use four spaces.

Other than indentation, the spacing in your code doesn’t matter, technically:

x = 3*4+12/func(x,y,z)
x = 3*4 + 12 /   func (x,   y, z)

But you should strive for proper style. Code that is in a uniform, predictable style is easier to parse, and therefore easier to understand. You’ve already installed a linter in your editor so that it can watch over your style. Use it.

And take some time to read the Python style guide, PEP 8.

Beyond the realm of a single Python file, code is organized into modules and packages. But to understand these, we have to talk briefly about namespaces.

Namespaces

Try this in your interpreter:

In [35]: import this

What you see there is “The Zen of Python”. It’s an easter-egg that’s been in Python since version 2.2.1. It comes from an email sent to the Python mailing list in 1999 by Tim Peters.

Notice that last line?

Namespaces are one honking great idea – let’s do more of those!

—The Zen of Python, Tim Peters

Python is all about namespaces. We’ve already met them in the form of local names in the scope of a function. In fact, the reason functions have local names is that, like any other object in Python, functions have a namespace. We can see it by calling the builtin function locals inside a function:

In [1]: def mynamespace(a, b, c=u'default'):
   ...:     print(locals())
   ...:

In [2]: mynamespace(1, 2)
{'c': 'default', 'a': 1, 'b': 2}

We’ve also seen it when we use dir to inspect an object in Python. What you see is the namespace of that object.

Another place we see namespaces is in those dots:

name.another_name

The “dot” indicates that you are looking for another_name in the namespace of the object bound to name. It could be any number of things:

  • name in a module
  • module in a package
  • attribute of an object
  • method of an object

Modules

In Python, a module is a kind of namespace. It might be a single file, or it could be a collection of files that define a shared API. As we have said before, to a first approximation, you can think of the files you write that end in .py as modules.

You can use the import statement to gain access to the names in a module. In combination with import, the from keyword provides a flexible syntax for accessing code. The module must be on Python’s module search path (sys.path, which includes the directories listed in the PYTHONPATH environment variable). If, for example, there is a module modulename.py on that path, then any of these forms will work:

import modulename

This binds the symbol modulename in the current namespace to the module modulename. All the names in the namespace of that module may be accessed from that module object by the . operator.

from modulename import this, that

This binds the values that are bound to the names this and that in modulename to the same names in the current namespace. No other names from modulename are brought in, and the name modulename itself is not bound either.

import modulename as a_new_name

This binds the symbol a_new_name in the current namespace to the module modulename. Again, the names in the module namespace may be referenced by the . operator from a_new_name.

from modulename import this as that

This binds to the name that in the current namespace the value from modulename that was bound to the symbol this. This import form (and the previous one) aliases objects under new names, which can be useful in the case of name collisions across different modules.
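
Since modulename is only a placeholder, here is a sketch of the same four forms using the standard library math module, so that you can run them. The alias names m and round_down are arbitrary choices for this example:

```python
import math
from math import pi, sqrt
import math as m
from math import floor as round_down

print(math.sqrt(16))      # name reached through the module object
print(sqrt(16), pi)       # names bound directly in this namespace
print(m.sqrt(16))         # the same module under an alias
print(round_down(2.7))    # the same function under an alias

# Aliases are just extra names for the same objects.
print(m is math, round_down is math.floor)
```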

Packages

A package is a module with other modules in it. On a filesystem, this is represented as a directory that contains one or more .py files, one of which must be called __init__.py. A package is also a namespace. You can likewise use import to gain access to the package, the modules it contains, and the names within them.

packagename/
├── __init__.py
└── modulename.py

import packagename.modulename

This binds the name packagename in the current namespace to the package, with modulename available as an attribute of it. Names within the module may then be accessed using the . operator from packagename.modulename.

from packagename.modulename import this, that

This binds the values of this and that in the modulename namespace to the same names in the current namespace. Neither packagename nor modulename is bound.

from packagename import modulename

This binds the module modulename to that same name in the current namespace. The name packagename is not bound.
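
To try these forms on a real package, here is a sketch using urllib from the standard library. It assumes Python 3, where urllib is a package containing a parse module. The URL is an arbitrary example:

```python
import urllib.parse
from urllib.parse import urlparse
from urllib import parse

url = "http://example.com/lessons/2"

print(urllib.parse.urlparse(url).netloc)   # reached through the package name
print(urlparse(url).path)                  # name bound directly
print(parse.urlparse is urlparse)          # the module bound under its own name
```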

For more information, you can read this article on Python imports.

Import

When you import a module, or a symbol from a module, the Python code is compiled to bytecode. The result is a .pyc file. In Python 2, these files are alongside the .py files. In Python 3, they go in a special folder called __pycache__.

This process executes all code at the module scope. For this reason, it’s a very good idea to avoid statements at module-scope that have global side-effects.

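The code in a module is NOT re-run when it is imported again. Python keeps every imported module in a cache (sys.modules), and a repeated import simply reuses the cached module object. The module must be explicitly reloaded to be re-run. In Python 3 the reload function lives in the importlib module; in Python 2 it is a builtin.

import importlib
import modulename
importlib.reload(modulename)  # Python 2: reload(modulename)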

Be careful when doing this. It can have unexpected effects if you are working with multiple modules that import each other.
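
You can watch this happen with any module. The sketch below assumes Python 3 and uses the standard library string module as a stand-in. The names first, second, and reloaded are ours:

```python
import importlib
import sys

import string
first = string

import string          # second import: the module body is not run again
second = string

print(first is second)                   # the very same module object
print(sys.modules['string'] is string)   # imported modules are kept in sys.modules

reloaded = importlib.reload(string)      # re-executes the module's code
print(reloaded is string)                # reload hands back the same module object
```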

Running a Module

In addition to importing modules, you can run them. We have seen this briefly before. There are a few ways to do this:

  • $ python hello.py – must be in current working directory

  • $ python -m hello – any module on PYTHONPATH anywhere on the system

  • $ ./hello.py – put #!/usr/bin/env python at top of module (Unix)

  • In [149]: run hello.py – at the IPython prompt – running a module brings its names into the interactive namespace

Like importing, running a module executes all statements at the module level. But there’s an important difference. Every module has a __name__ symbol in its namespace. When you import the module, that symbol is bound to the name of the module (the file name without the .py extension). But when you run a module, it is bound to the string "__main__".

This allows you to create blocks of code protected by a conditional that checks for this. The contained code is only run when the module is run.

if __name__ == '__main__':
    # Do something interesting here
    # It will only happen when the module is run
    pass  # placeholder so the block is valid Python

Main Blocks

This pattern is very common. It’s useful in a number of cases. You can put code here that lets your module be a utility script. You can put code here that demonstrates the functions contained in your module. And you can put code here that proves that your module code works.

Assert

Writing tests that demonstrate that your program works is an important part of learning to program. The Python assert statement is useful in writing simple main blocks that test your code. It is followed by a Python expression which is evaluated for its boolean value. If the value is falsy, an AssertionError is raised.

# calculations.py
def add(n1, n2):
    """Return the sum of n1 and n2."""
    return n1 + n2

if __name__ == '__main__':
    # adding produces the right sum
    assert add(3, 4) == 7
    # adding does not produce the wrong sum
    assert add(3, 4) != 10

We’ll learn more about testing soon.