Understanding Python Errors

Reading errors

By now, you have probably run into Python errors many times. Here is an example:

In [ ]:
s = 'hello'
s[0] = 'a'

This is called an exception traceback. The exception is the error itself, and the traceback is the information that shows where it occurred. The above traceback is quite simple (because the code producing the error is quite simple). The most important part of any error is the last line, in this case:

TypeError: 'str' object does not support item assignment

Now let's look at a slightly more complex error:

In [ ]:
import numpy as np

def subtract_smooth(x, y):
    y_new = y - median_filter(x, y, 2.5)
    return y_new

def median_filter(x, y, width):
    y_new = np.zeros(y.shape)
    for i in range(len(x)):
        y_new[i] = np.median(y[np.abs(x - x[i]) < width * 0.5])
    return y_new
In [ ]:
subtract_smooth(np.array([1,2,3,4,5]),np.array([4,5,6,8]))

The error is now more complex. The first chunk shows what top-level code was executed when the error occurred, in this case the call to subtract_smooth:

 IndexError                                Traceback (most recent call last)
 <ipython-input-3-d9260f0fd73b> in <module>
 ----> 1 subtract_smooth(np.array([1,2,3,4,5]),np.array([4,5,6,8]))

The next chunk shows where the error occurred inside subtract_smooth:

<ipython-input-2-149558127e8f> in subtract_smooth(x, y)
      2 
      3 def subtract_smooth(x, y):
 ----> 4     y_new = y - median_filter(x, y, 2.5)
      5     return y_new


You can see that it happened when calling median_filter. Finally, we can see where the error occurred inside median_filter:

<ipython-input-2-149558127e8f> in median_filter(x, y, width)
      8     y_new = np.zeros(y.shape)
      9     for i in range(len(x)):
---> 10         y_new[i] = np.median(y[np.abs(x - x[i]) < width * 0.5])
     11     return y_new

So tracebacks show you the full history of the error!

Now in the above case, the final error is:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 4 but corresponding boolean dimension is 5

Why is this occurring? The only place that boolean indices are used here is when doing:

np.abs(x - x[i]) < width * 0.5

The issue is that if we look back at the original function call, there are more values for x than for y!
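The mismatch can be confirmed by comparing shapes directly. This is a minimal sketch that rebuilds the boolean mask for the first loop iteration (i = 0) using the same arrays as the failing call; the name `mask` is only illustrative:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([4, 5, 6, 8])
width = 2.5

# Same expression as in median_filter, evaluated for i = 0
mask = np.abs(x - x[0]) < width * 0.5

print(mask.shape)  # shape of the boolean index, which comes from x
print(y.shape)     # shape of the array being indexed
```

The mask has one entry per element of x (five), while y has only four elements, which is exactly what the IndexError message reports.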

Using the debugger

In the above example, the code was still simple enough that we could guess the solution, but sometimes things are not so simple. One way to diagnose the issue would have been to print out the content of the variables in median_filter and run it again to see what was going on.

However, Python includes a debugger, which allows you to jump right into where the error happened and look at the variables. In the IPython notebook or in IPython, once an error has happened, you can run %debug, and you will see an ipdb> prompt (IPython debugger). You can then print out variables to see what they are set to. Let's try the above example again:

In [ ]:
subtract_smooth(np.array([1,2,3,4,5]),np.array([4,5,6,8]))
In [ ]:
%debug

We can see that the boolean array that is being used as indices to y is too big. Much simpler! Type exit to exit the debugger.

Catching exceptions

In some cases, we know that errors might happen, and we don't want them to crash the code. For example, if we have to read in 1000 files, a few might be corrupt, and we just want to skip over them. Or we want to try something, and if it doesn't work, do something else. To do this, we can catch exceptions:

In [ ]:
s = 'hello'
In [ ]:
s[1] = 'a'
In [ ]:
try:
    s[1] = 'a'
except:
    print("Can't set s[1]")

The try...except construct above catches all exceptions, but in some cases we want to be a bit more specific. The error that occurs above is a TypeError, which is just one kind of error (others include ValueError, SystemError, etc.). To catch just TypeError, you can do:

In [ ]:
try:
    s[1] = 'a'
except TypeError:
    print("Can't set s[1]; TypeError")
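The caught exception can also be bound to a name with `as`, which gives access to its type and message. A small sketch (the names `exc`, `error_type`, and `error_message` are only illustrative):

```python
s = 'hello'

try:
    s[1] = 'a'  # strings are immutable, so this raises TypeError
except TypeError as exc:
    error_type = type(exc).__name__  # name of the exception class
    error_message = str(exc)         # the human-readable message

# Together these two parts make up the last line of the traceback
print(f"{error_type}: {error_message}")
```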

If you catch a different kind of error, the TypeError is not caught and is raised as usual:

In [ ]:
try:
    s[1] = 'a'
except ValueError:
    print("Can't set s[1]")

Catching exceptions is tricky business, however, because it can hide important hints for debugging.
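One way to limit that risk, sketched here for the corrupt-file scenario mentioned earlier, is to catch only the exception you expect and to record what was skipped and why. The in-memory `records` list and the `parse_record` helper are hypothetical stand-ins for real files and a real reader:

```python
def parse_record(text):
    """Hypothetical parser: convert a text record to a float."""
    return float(text)  # raises ValueError for malformed input


records = ["1.5", "2.0", "corrupt", "4.25"]  # stand-in for file contents

values = []
skipped = []
for record in records:
    try:
        values.append(parse_record(record))
    except ValueError as exc:
        # Keep the original message so the problem can still be diagnosed
        skipped.append((record, str(exc)))

print(values)
print(skipped)
```

Any other kind of error, for example a TypeError from passing None to the parser, is not caught here and would still show up with its full traceback.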

Raising errors

If something goes wrong in your code, rather than just printing some warning and going on, it's usually much better to raise an exception yourself. You'll make sure execution halts where an error has occurred, rather than having confusing junk results much later, when it is hard to figure out what went wrong where.

Raising exceptions is not hard:

In [ ]:
raise ValueError("But writing good, descriptive error messages is.")

raise is another reserved word in Python. What follows it is an exception; you can define your own exception classes once you know how to deal with objects and inheritance, but there's nothing wrong with re-using exceptions you know from the standard library: ValueError if some value doesn't meet your expectation, IOError (an alias of OSError in Python 3) if something is wrong with a file, KeyError if a lookup of something failed... You can always use Exception (which is a “base class” of all these exceptions) if unsure.
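As a sketch of re-using a standard exception, the input check below would have stopped the mismatched arrays from the earlier example before any filtering started. The function name `check_same_length` is hypothetical, and plain lists are used to keep the example short:

```python
def check_same_length(x, y):
    """Raise ValueError if x and y do not have the same number of items."""
    if len(x) != len(y):
        raise ValueError(
            f"x and y must have the same length, got {len(x)} and {len(y)}"
        )


check_same_length([1, 2, 3, 4], [4, 5, 6, 8])  # same length: passes silently
```

Calling `check_same_length([1, 2, 3, 4, 5], [4, 5, 6, 8])` instead raises a ValueError whose message states both lengths, so the reader of the traceback does not have to guess what went wrong.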

Note, however, that it's almost always a bad idea to catch an exception and re-raise one of your own design. You'll most likely be hiding the root cause of the problem, making analysis of the problem much harder.
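In the rare cases where re-raising does seem justified, Python 3 lets you keep the root cause attached by writing `raise ... from ...`. The following is only a sketch under that assumption; the `read_port` function and the `settings` dictionary are hypothetical:

```python
def read_port(settings):
    """Hypothetical helper: look up and convert the 'port' setting."""
    try:
        return int(settings["port"])
    except KeyError as exc:
        # 'from exc' stores the original KeyError as the cause of the new error
        raise RuntimeError("no 'port' entry in settings") from exc


try:
    read_port({})
except RuntimeError as err:
    cause = err.__cause__  # the original exception is still available

print(repr(cause))
```

When such an exception is left uncaught, the traceback shows both errors, so the original problem is not hidden.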

Also, don't claim more than you know in your error message. (Don't execute the following code if you're running your notebook as root, which of course one should not do anyway!)

In [ ]:
# THIS IS HOW *NOT* TO DO IT!
try:
    with open("/etc/passwd", "a") as f:
        pass
except IOError:
    # don't do this anyway: Catch and raise something else
    # And in particular don't claim more than you know.
    raise Exception("/etc/passwd does not exist")

At least on Unix systems, the existence claim is patently untrue, as you can verify with ls. What actually failed is that you're not allowed to write to that file. By claiming more than you actually know in your error message, you will confuse users.
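A more honest version, sketched below, lets the original exception speak for itself. It assumes Python 3, where IOError is an alias of OSError and failures such as a missing file or a permission problem are distinct subclasses. The function `describe_open_failure` and the file name are hypothetical:

```python
import os
import tempfile


def describe_open_failure(path, mode):
    """Try to open path; return None on success or a factual message on failure."""
    try:
        with open(path, mode):
            return None
    except OSError as exc:
        # Report only what the exception tells us: its type and its own message
        return (
            f"could not open {path!r} with mode {mode!r}: "
            f"{type(exc).__name__}: {exc}"
        )


# A path inside a fresh temporary directory, so it is certain not to exist
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "no_such_file.txt")
    message = describe_open_failure(missing, "r")

print(message)
```

Because the message is built from the exception itself, a missing file and a permission problem are reported differently without the code having to guess which one occurred.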