A Concise Guide to Python 2
© David Matuszek, 2011

Introduction

Python is open source and is available from http://www.python.org/download/. At the time of this writing, Python 2 and Python 3 are both in use, but Python 2 is still more popular. Python comes with a nice (if very basic) IDE, called IDLE.

This document describes some of the more commonly useful Python functions and methods. There are very many additional functions and methods, and some of the ones described herein take additional, optional parameters to make them even more useful. See The Python Standard Library documentation for more detailed information.

Formatting is significant

Most programming languages group statements with explicit delimiters, usually braces, {...}, or begin...end. It is then the programmer's responsibility to adjust the indentation to match. Python dispenses with the delimiters and uses the indentation itself to indicate grouping. For example,
if 2 + 2 == 4:
    print 'Arithmetic works!'
    print '...as expected.'
else:
    print 'Somebody goofed up somewhere!'

The first line in a program may not be indented. Lines in the same block must be indented the same amount (4 spaces is standard). Tab characters may be used, but are discouraged, and may not be legal in a future version of Python. Any good editor will have an option to replace tabs with spaces as you type.

Each simple statement is written on a separate line. If a line contains an unclosed '(', '[', or '{', it is continued on the next line; otherwise, a line may be continued by ending it with a backslash, '\'. You can put multiple statements on a line if you separate them with semicolons, but this is discouraged.
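
As a small sketch of these continuation rules (the variable names are made up for illustration):

```python
# Implicit continuation: the unclosed '(' carries the statement onto the next line.
total = (1 + 2 +
         3 + 4)

# Explicit continuation: a backslash as the very last character of the line.
message = 'Arithmetic ' + \
          'works!'

# Two statements on one line, separated by a semicolon (legal, but discouraged).
a = 1; b = 2
```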

Conventions

Coding conventions

(Major points summarized from PEP 8.)

Naming conventions

Documentation string conventions

(Major points summarized from PEP 257.)

Identifiers

Identifiers begin with a letter or underscore and may contain letters, underscores, and digits; case is significant.

Conventions:

Methods and functions

Functions defined within a class are called methods. Function calls are written as functionName(arguments), for example, abs(-5). Method calls are written as object.methodName(arguments), for example, 'abc'.upper(). All values are objects, and may have methods.

Built-in object types

Numbers

An integer consists of a sequence of digits. It is decimal (base 10) unless it begins with a zero: a leading 0 (or 0o) marks an octal number, 0x a hexadecimal number, and 0b a binary number.

A floating point ("real") number includes a decimal point, an exponent suffix, or both. The exponent consists of the letter e or E, an optional sign, and one or more digits.

An imaginary number consists of a decimal integer or a floating point number, suffixed by j (not i) or J. A complex number consists of the sum or difference of an integer or floating-point number and an imaginary number.

A leading sign, if present, is not counted as part of the number, but as an operator.
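
A few literals, as a sketch of the forms described above. The 0o spelling of octal is used because it is accepted by both Python 2.6+ and Python 3.

```python
decimal_int = 255
hex_int = 0xFF          # hexadecimal
octal_int = 0o377       # octal
real = 2.5e-3           # floating point with an exponent: 0.0025
imaginary = 4j          # imaginary number, suffixed by j
z = 3 + 4j              # complex: the sum of an integer and an imaginary number
negative = -5           # the literal is 5; the minus sign is a unary operator
```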

Functions on numbers

Strings

A string is a sequence of zero or more characters, enclosed in single quotes ('...'), double quotes ("..."), triple single quotes ('''...'''), or triple double quotes ("""..."""). It can be treated as a sequence of characters.

Triply-quoted strings may extend across several lines, and include the line breaks as part of the string, unless the line break is preceded by a backslash ('\').

A string may contain "escaped" characters; the most important ones are \' (single quote), \" (double quote), \\ (backslash), \t (tab), and \n (newline).

A raw string is a string prefixed with r or R. In a raw string the backslash does not escape characters; all characters stand for themselves.
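
The quoting and escaping rules above can be sketched as follows (the string contents are arbitrary):

```python
single = 'a string'
double = "It's a string"            # no need to escape the apostrophe here
poem = '''Roses are red,
Violets are blue'''                 # the line break is part of the string
joined = '''one \
line'''                             # backslash before the line break: no newline stored
tabbed = 'a\tb'                     # \t is a single tab character
raw = r'C:\temp\new'                # raw string: \t and \n are not escapes here
```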

A Unicode string is prefixed with u or U. Unicode strings may contain Unicode characters, designated by \uXXXX, where the Xs are four hexadecimal digits, or by \N{name}, where the name is a standard Unicode name for a character.

Raw Unicode strings are prefixed by ur, but not by ru.

A string occurring by itself (not part of some other statement) as the first statement within a function, method, or class is a documentation string, and is stored in that object's __doc__ attribute.
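
For instance, with a hypothetical function named double:

```python
def double(x):
    """Return twice the value of x."""
    return 2 * x

# The documentation string is available at run time.
doc = double.__doc__
```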

Functions on strings

String methods

Boolean values

Python has the values True and False, but in addition, all numeric zero values are false, and all nonzero values are true. Empty strings and empty sequences are also false; non-empty ones are true. In a numeric expression, True has the value 1 and False has the value 0. The special value None indicates "no value," and is treated as false.

Python has the boolean operators not, and, and or, with the usual meanings. The and and or operators are short-circuit; that is, the second operand is evaluated only if necessary.
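
A sketch of truth values and short-circuit evaluation. The function explode is hypothetical and exists only to show that the second operand is never evaluated. Empty strings and lists are included among the false values, which is standard Python behavior.

```python
truthy = [bool(v) for v in (1, -2.5, 'x', [0])]
falsy = [bool(v) for v in (0, 0.0, None, '', [])]

count = True + True              # True counts as 1 in arithmetic

def explode():
    raise RuntimeError('the second operand was evaluated')

safe_and = False and explode()   # result already known: explode() is not called
safe_or = True or explode()      # likewise
```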

Sequences

A string is an immutable sequence of characters (see above).

A tuple is an immutable sequence of values, not necessarily all the same type, enclosed in parentheses, (). To distinguish a one-tuple from a parenthesized expression, put a comma after the single value, for example, ("just one",).

A list is a mutable sequence of values, not necessarily all the same type, enclosed in brackets, [].

A range is a sequence of integers; in Python 2, range returns it as a list (xrange produces the same integers lazily, one at a time). range(stop) produces the integers 0..stop-1; range(start, stop) produces the integers start..stop-1; and range(start, stop, step) produces the integers start, start+step, ... up to but not including stop. The step may be negative.
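
A short sketch of the three sequence types and of range. Wrapping range in list() makes the result a list under both Python 2 and Python 3.

```python
one_tuple = ("just one",)            # the trailing comma makes this a tuple
not_a_tuple = ("just one")           # merely a parenthesized string

mixed = [1, 'two', 3.0]              # values need not share a type
mixed[1] = 2                         # lists are mutable

first_five = list(range(5))          # 0..4
middle = list(range(2, 6))           # 2..5
countdown = list(range(10, 0, -3))   # negative step, stops before 0
```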

For any sequence or range seq:

For mutable sequences (lists, but not strings or tuples), all the above subsequences are assignable; that is, they may occur on the left-hand side of the assignment operator, =.
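
The list of subsequence forms is not reproduced here. As a sketch, assuming the usual indexing and slicing forms seq[i], seq[i:j], and seq[i:j:k]:

```python
word = 'Python'
first = word[0]                    # 'P'
last = word[-1]                    # negative indices count from the end: 'n'
middle = word[1:4]                 # 'yth' (index 4 is not included)
every_other = word[::2]            # 'Pto'

numbers = [0, 1, 2, 3, 4, 5]
numbers[0] = 10                    # assign to a single element
numbers[1:3] = ['a', 'b', 'c']     # assign to a slice; the length may change

# Strings are immutable, so the same assignment fails for them.
try:
    word[0] = 'J'
    string_assignable = True
except TypeError:
    string_assignable = False
```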

All ranges and sequence types are iterable. Iterating through a string produces single characters.

Other iterable data types

The following types are iterable but are not sequences: the order in which values are stored is determined by their hash codes (for a dictionary, the hash codes of the keys), not by lexicographic order or by the order in which the programmer inserted them.

A set is one or more values, not necessarily all the same type, enclosed in braces, {}. The empty set cannot be written as {}, because that denotes an empty dictionary; use set() instead.

A dictionary is zero or more key:value pairs, enclosed in braces, {}. Keys must be immutable.
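
A sketch of sets and dictionaries, with made-up contents. Set literals in braces require Python 2.7 or later.

```python
vowels = {'a', 'e', 'i', 'o', 'u'}   # a set
empty_set = set()                    # not {}, which is an empty dictionary
empty_dict = {}

ages = {'Bill': 43, 'Ann': 37}       # key:value pairs
ages['Cy'] = 29                      # add a new pair

by_point = {(0, 0): 'origin'}        # a tuple is immutable, so it can be a key

# A list is mutable, so it cannot be a key.
try:
    bad = {[0, 0]: 'origin'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

names = sorted(ages)                 # iterating over a dictionary yields its keys
```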

Functions on iterable data types

List comprehensions

A list comprehension is a way of computing a list from a sequence. There are two general forms:
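
The two forms are not reproduced here. The sketch below assumes they are the plain form, [expression for variable in sequence], and the filtered form, which adds an if condition:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Plain form: one result per element.
squares = [n * n for n in numbers]

# Filtered form: only elements that satisfy the condition contribute.
even_squares = [n * n for n in numbers if n % 2 == 0]
```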

Functions on general objects

Operators

Arithmetic                                  Comparisons
+    addition (also unary plus)             <    less than
-    subtraction (also unary minus)         <=   less than or equal to
*    multiplication                         ==   equal to
/    division (integer division if          !=   not equal to
     both operands are integers)            <>   not equal to (deprecated)
%    modulus                                >=   greater than or equal to
**   exponentiation (right associative)     >    greater than

Comparisons can be chained, for example, 0 < x <= 100.
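
A brief sketch of chaining and of the right associativity of ** (the values are arbitrary):

```python
x = 42
in_range = 0 < x <= 100                   # chained comparison
spelled_out = (0 < x) and (x <= 100)      # the equivalent unchained test

power = 2 ** 3 ** 2                       # right associative: 2 ** (3 ** 2)
remainder = 17 % 5                        # modulus
```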

Boolean (Logical)                       Bit operations
not      negation                       ~    bitwise complement
and      conjunction (short circuit)    &    bitwise and
or       disjunction (short circuit)    |    bitwise or
is       identity                       ^    bitwise exclusive or
is not   non-identity                   <<   left shift
in       membership                     >>   right shift
not in   non-membership
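
As a sketch, the bit operations applied to small integers, and the difference between equality (==) and identity (is):

```python
flags = 0b1100
mask = 0b1010
both = flags & mask        # 0b1000
either = flags | mask      # 0b1110
differ = flags ^ mask      # 0b0110
shifted = 1 << 4           # 16
halved = 16 >> 2           # 4
inverted = ~5              # -6, since ~n equals -n - 1

a = [1, 2, 3]
b = [1, 2, 3]
c = a
equal = a == b             # same contents
identical = a is b         # two distinct list objects
same_object = a is c       # two names for one object
found = 2 in a
missing = 7 not in a
```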

Statements

Each entry lists the statement type, examples, and comments.

Statement type:
    assert expression1
    assert expression1, expression2
Examples:
    assert 2 + 2 == 4
    assert x > 0, "Bad value"
Comments: Raises an AssertionError if expression1 is false, and has no effect otherwise. The optional expression2 is used as a message by the AssertionError.

Statement type:
    variable = expression
Examples:
    count = 0
Comments: Assignment.

Statement type:
    var1, ..., varN = val1, ..., valN
Examples:
    name, age = 'Bill', 43
Comments: Multiple (simultaneous) assignment.

Statement type:
    variable op= expression
Examples:
    count += 1
Comments: Augmented assignment. v op= e is equivalent to v = v op e, where the op may be any of + - * / // % ** << >> & | ^.

Statement type:
    if expression:
        statements
    elif expression:
        statements
    else:
        statements
Examples:
    if x > y:
        print "x is bigger"
    elif y > x:
        print "y is bigger"
Comments: Multiple elifs may be used. Using elif instead of else/if saves a level of indentation, and makes parallel cases appear parallel rather than nested.

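Statement type:
    for variable in sequence:
        statements
    else:
        statements
Examples:
    for x in (5, 4, 3, 2, 1):
        print x,
    else:
        print "Blast off!"
Comments: The optional else: clause is executed when the loop exits normally, that is, not by way of a break.

Statement type:
    while expression:
        statements
    else:
        statements
Examples:
    while x < 1000:
        x *= 2
    else:
        print x
Comments: The optional else: clause is executed when the loop exits normally, that is, not by way of a break.

Statement type:
    break
Examples:
    break
Comments: Exits the immediately enclosing loop; the else part, if present, is skipped.

Statement type:
    continue
Examples:
    continue
Comments: Starts the next iteration of the loop.

Statement type:
    pass
Examples:
    pass
Comments: Does nothing. Occasionally needed because grouping is indicated by formatting rather than punctuation.
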
Statement type:
    raise exception
    raise exception, value
    raise exception, value, trace
Examples:
    raise ValueError
Comments: Raises an exception. The value is any expression, used as a message in the exception; the trace must be a traceback object.

Statement type:
    try:
        statements
    except expression as variable:
        statements
    else:
        statements
Examples:
    try:
        average = sum / count
    except ZeroDivisionError as e:
        print "divided by zero!"
    else:
        print "average =", average
Comments: The expression and as variable are optional. The optional else clause, if present, is executed if no exception was raised.

Statement type:
    try:
        statements
    except expression as variable:
        statements
    finally:
        statements
Examples:
    try:
        average = sum / count
    except ZeroDivisionError:
        print "divided by zero!"
    finally:
        print "Done."
Comments: The expression and as variable are optional. The optional finally clause, if present, is executed whether or not an exception was raised.

Statement type:
    del variable
Examples:
    del x
Comments: Deletes a variable.

Statement type:
    exec expression
Examples:
    exec "print 'hello'"
Comments: Executes a string, open file object, or code object.

Statement type:
    import module
    from module import names
    import module as name
    from module import name1 as name2
Examples:
    from java import util
Comments: Names may be changed when imported.

Statement type:
    global var1, ..., varN
Examples:
    global max, min
Comments: Any variable given a value within a function is local to that function unless declared global in the function.

Statement type:
    return
    return value
Examples:
    return result
Comments: As in Java.

Statement type:
    class name:
        statements
    class name (base1, ..., baseN):
        statements
Examples:
    class student (person):
        pass
Comments: Declares a class and tells what classes it extends.

Statement type:
    def name (parameters):
        "Documentation string"
        statements
Examples:
    def sum (x, y):
        return x + y
Comments: Declares a function. The documentation string tells the purpose of the function.
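
Two of the less familiar statements, sketched with hypothetical functions: a loop whose else clause runs only when no break occurred, and a try statement. Combining except, else, and finally in a single try statement is an addition here; it is legal in Python 2.5 and later.

```python
def find_first_negative(values):
    """Return the first negative value, or None if there is none."""
    for v in values:
        if v < 0:
            found = v
            break              # leaves the loop and skips the else clause
    else:
        found = None           # reached only if the loop ran to completion
    return found

def safe_average(total, count):
    """Return (average, log), where log records which clauses ran."""
    log = []
    try:
        average = float(total) / count
    except ZeroDivisionError:
        log.append('divided by zero')
        average = None
    else:
        log.append('no exception')
    finally:
        log.append('done')     # runs in either case
    return average, log
```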

Functions

The simplest form of a function is:

    def functionName(parameter1, ..., parameterN):
        """Optional but strongly recommended documentation string goes here."""
        # Statements, possibly including a return statement

Parameters must be simple names, not expressions. The function can be called by giving its name and values for the parameters, for example, average(3.5, y). Parentheses are required, in both the function definition and the function call, even if the function does not take parameters.

Note: Parameters in the function definition are sometimes called "formal parameters"; parameters in the call to a function are sometimes called "actual parameters" or "arguments". Sometimes formal parameters are also called "arguments", though this is an incorrect usage.

Every function returns a value. Use statements of the form return value to specify the value to return. Omit the value, or simply "flow off" the end of the function, to return the special value None.

In Python functions are values, and can be assigned to variables. For example, absval = abs makes absval a synonym for abs, and absval(-5) will return 5.
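
A sketch of these points. The functions average and greet are hypothetical.

```python
def average(x, y):
    """Return the mean of x and y."""
    return (x + y) / 2.0

def greet(names):
    """Append a greeting to names; with no return statement, returns None."""
    names.append('hello')

result = average(3.5, 4.5)     # call with actual parameters
nothing = greet([])            # "flowing off" the end returns None

absval = abs                   # a function is a value like any other
five = absval(-5)
```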

Scope

The scope of a name is the part of the program in which the name has meaning and can be used.

To simplify somewhat, Python has two scopes: local and global. Local variables are those defined within a function or method, and are available only within that function or method. Global variables are those available throughout the program. Local and global variables are in different namespaces, so the same name can be used both locally and globally, with different values (though this is confusing and should be avoided).

Terminology: To "read" a variable is to get the value it contains; to "write to" a variable is to change that value.

Python's rules are somewhat unusual.
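
The detailed rules are not reproduced here. As a minimal sketch of the behavior described elsewhere in this guide (a variable given a value within a function is local unless declared global), assuming the code runs at the top level of a module:

```python
counter = 0                      # a global variable

def read_global():
    return counter + 1           # reading a global needs no declaration

def write_local():
    counter = 100                # assignment creates a new local variable
    return counter

def write_global():
    global counter               # now assignment changes the global
    counter = counter + 1
    return counter

read_result = read_global()
local_result = write_local()
after_local = counter            # the global is unchanged by write_local
global_result = write_global()
after_global = counter           # the global was changed by write_global
```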

Input and output

Interactive I/O

raw_input() or raw_input(prompt) reads one line, as a string, from the user.

input() or input(prompt) reads one line from the user, evaluates it as a Python expression, and returns the resulting value.

print expr1, ..., exprN evaluates and prints the expressions, separated by spaces and followed by a newline. If there is a comma after exprN, the newline is omitted.

File I/O

To do file I/O you must (1) open the file, (2) use the file, and (3) close the file.

Text (default, non-binary) mode converts platform-specific line endings to the current platform. This will corrupt binary files. Text files, on the other hand, can be read and written as binary without harm.
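
The three steps, sketched with a temporary file so that nothing permanent is created. The file name example.txt is arbitrary.

```python
import os
import tempfile

directory = tempfile.mkdtemp()
path = os.path.join(directory, 'example.txt')

# (1) open for writing, (2) use, (3) close
f = open(path, 'w')
f.write('first line\n')
f.write('second line\n')
f.close()

# (1) open for reading, (2) use, (3) close
f = open(path, 'r')
lines = f.readlines()            # each line keeps its trailing newline
f.close()

# Remove the temporary file and directory.
os.remove(path)
os.rmdir(directory)
```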

Unit testing

Unit tests and test cases

A unit test is a test of the methods in a single class. A test case tests the response of a single method to a particular set of inputs. To do unit testing,

  1. import unittest
  2. import fileToBeTested
  3. class NameOfTestClass(unittest.TestCase):
    1. Define def setUp(self) and def tearDown(self) if wanted.
    2. Provide one or more def testSomething(self) methods (names must begin with test).

setUp is called before each test case, and should initialize everything to a "clean" state.
tearDown is called after each test case, to remove artifacts (such as files) that may have been created.

The following are some of the methods available in test methods:

Typically b and x are calls to the method being tested, while a is the expected result.

An easy way to run the tests is to put the following code at the end of the test file:

if __name__ == '__main__':
    unittest.main()
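
A complete, minimal test class as a sketch. The function add stands in for code that would normally be imported from the file being tested, and only the assertEqual and assertTrue methods are used. The tests are run programmatically here, rather than with unittest.main(), so that the result object can be examined.

```python
import unittest

def add(x, y):
    """Hypothetical function under test."""
    return x + y

class TestAdd(unittest.TestCase):

    def setUp(self):
        # Called before each test method.
        self.values = [1, 2, 3]

    def tearDown(self):
        # Called after each test method.
        self.values = None

    def test_add_integers(self):
        self.assertEqual(5, add(2, 3))

    def test_sum_is_positive(self):
        self.assertTrue(add(self.values[0], self.values[1]) > 0)

suite = unittest.TestLoader().loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```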

Test suites

Unit tests can be combined into a test suite. If file testfoo.py contains the class TestFoo, and file testbar.py contains the class TestBar, the test suite can be written like this:

import unittest
import testfoo, testbar

def suite():
    # Collect the tests from both test classes into one suite.
    test_suite = unittest.TestSuite()
    test_suite.addTest(unittest.makeSuite(testfoo.TestFoo))
    test_suite.addTest(unittest.makeSuite(testbar.TestBar))
    return test_suite

if __name__ == '__main__':
    test_suite = suite()
    runner = unittest.TextTestRunner()
    runner.run(test_suite)

Classes and methods

A class describes a new type of object, and bundles together the methods used for working with objects of that type. The syntax for defining a class is
    class ClassName(NameOfParentClass):
        variable and method definitions

where NameOfParentClass is the name of the superclass (usually object).

To create an instance (object) of a class, use the name of the class as if it were a function name.

The class being defined inherits all the variables and methods of the parent class.

To allow parameterized creation of new objects, define a method named __init__ within the class. The first (required) parameter of this method is conventionally named self and refers to the object being created. To create an instance, do not call __init__; use the name of the class as if it were a function name, and supply values for all the parameters except self.

To call a method defined in an object (more properly called "sending a message to the object"), use the syntax object.method(arguments). To call a method from another method defined within the same object, use the syntax self.method(arguments).

To access a variable defined in an object from within the same object, use the syntax self.variable. To directly access a variable defined in a different object, it is legal to say object.variable, but this is frowned upon.

If you define a method __str__(self), this method will be called when the object is printed; therefore, it should return a string.
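
Putting these pieces together, as a sketch with a hypothetical Person class:

```python
class Person(object):
    """A person with a name and an age."""

    def __init__(self, name, age):
        # self refers to the object being created.
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1

    def celebrate(self):
        self.birthday()              # call another method of the same object
        return 'Happy birthday, ' + self.name

    def __str__(self):
        return self.name + ' (' + str(self.age) + ')'

bill = Person('Bill', 43)            # calls __init__; self is supplied automatically
message = bill.celebrate()
text = str(bill)                     # uses __str__, as printing would
```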

For comparing objects, you may define the methods __lt__(self, other), __le__(self, other), __eq__(self, other), __ne__(self, other), __ge__(self, other), and __gt__(self, other); these are used by the corresponding Python operators. There are no implied relationships among these methods.
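
Because no relationships are implied, a class that should support every comparison operator defines each method. A sketch, using a hypothetical Version class and a helper method _key that is not part of Python:

```python
class Version(object):
    """A version number such as 2.7, compared by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Hypothetical helper: the tuple that all comparisons are based on.
        return (self.major, self.minor)

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return self._key() != other._key()

    def __lt__(self, other):
        return self._key() < other._key()

    def __le__(self, other):
        return self._key() <= other._key()

    def __gt__(self, other):
        return self._key() > other._key()

    def __ge__(self, other):
        return self._key() >= other._key()

older = Version(2, 7)
newer = Version(3, 0)
```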