
4.6 Function Gotchas

Here are some of the more jagged edges of functions you might not expect. They're all obscure, but most have been known to trip up a new user.

4.6.1 Local Names Are Detected Statically

As we've seen, Python classifies names assigned in a function as locals by default; they live in the function's scope and exist only while the function is running. What we didn't tell you is that Python detects locals statically, when it compiles the code, rather than by noticing assignments as they happen at runtime. Usually, we don't care, but this leads to one of the most common oddities posted on the Python newsgroup by beginners.

Normally, a name that isn't assigned in a function is looked up in the enclosing module:

>>> X = 99
>>> def selector():        # X used but not assigned
...     print X            # X found in global scope
...
>>> selector()
99

Here, the X in the function resolves to the X in the module outside. But watch what happens if you add an assignment to X after the reference:

>>> def selector():
...     print X              # does not yet exist!
...     X = 88               # X classified as a local name (everywhere)
...                          # can also happen if "import X", "def X",...
>>> selector()
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in selector
NameError: X

You get an undefined name error, but the reason is subtle. Python reads and compiles this code when it's typed interactively or imported from a module. While compiling, Python sees the assignment to X and decides that X will be a local name everywhere in the function. But later, when the function is actually run, the assignment hasn't yet happened when the print executes, so Python says you're using an undefined name. According to its name rules, it should; local X is used before being assigned.^[5]

^[5] In fact, any assignment in a function body makes a name local: import, =, nested defs, nested classes, and so on.
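The compile-time classification can be observed before the function is ever called. The sketch below is an illustration under stated assumptions, not part of the original session: it assumes a Python 3 interpreter, where print is a function and the error surfaces as UnboundLocalError (a subclass of NameError). It inspects the names the compiler recorded as locals, including one created by an import, as the footnote describes.

```python
X = 99

def selector():
    try:
        print(X)              # X is already local here, but not yet assigned
    except NameError as exc:
        # Assumption: Python 3, which raises UnboundLocalError (a NameError subclass)
        return type(exc).__name__
    X = 88                    # this assignment makes X local everywhere in selector
    import os                 # an import assigns a local name too

# The compiler recorded the local names before any call was made.
local_names = selector.__code__.co_varnames
print('X' in local_names, 'os' in local_names)
print(selector())             # name of the exception raised by the early reference
print(X)                      # the global X was never touched
```

Because the list of locals is fixed when the def is compiled, the early reference fails no matter which branch of the function would have run the assignment.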

4.6.1.1 Solution

The problem occurs because assigned names are treated as locals everywhere in a function, not just after statements where they are assigned. Really, the code above is ambiguous at best: did you mean to print the global X and then create a local X, or is this a genuine programming error? Since Python treats X as a local everywhere, it is an error; but if you really mean to print global X, you need to declare it in a global statement:

>>> def selector():
...     global X           # force X to be global (everywhere)
...     print X
...     X = 88
...
>>> selector()
99

Remember, though, that this means the assignment also changes the global X, not a local X. Within a function, you can't use both local and global versions of the same simple name. If you really meant to print the global and then set a local of the same name, import the enclosing module and qualify to get to the global version:

>>> X = 99
>>> def selector():
...     import __main__      # import enclosing module
...     print __main__.X     # qualify to get to global version of name
...     X = 88               # unqualified X classified as local
...     print X              # prints local version of name
...
>>> selector()
99
88

Qualification (the .X part) fetches a value from a namespace object. The interactive namespace is a module called __main__, so __main__.X reaches the global version of X. If that isn't clear, check out Chapter 5.
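A variation on the same idea, sketched under assumptions rather than taken from the original: instead of importing __main__, the function below reads the module-level value through the built-in globals() dictionary, which works whether or not the enclosing module is the interactive one. It is written for Python 3 and returns its two values instead of printing them.

```python
X = 99

def selector():
    global_x = globals()['X']   # read the module-level X by name
    X = 88                      # unqualified X is still classified as local
    return global_x, X

print(selector())   # the global value first, then the local one
print(X)            # the global X is unchanged by the local assignment
```

Either way, the point is the same: the unqualified name X refers to one scope only within the function, so the other version has to be reached through a namespace object.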

4.6.2 Nested Functions Aren't Nested Scopes

As we've seen, the Python def is an executable statement: when it runs, it assigns a new function object to a name. Because it's a statement, it can appear anywhere a statement can—even nested in other statements. For instance, it's completely legal to nest a function def inside an if statement, to select between alternative definitions:

if test:
    def func():          # define func this way
        ... 
else:
    def func():          # or else this way instead
        ...
...
func()

One way to understand this code is to realize that the def is much like an = statement: it assigns a name at runtime. Unlike C, Python functions don't need to be fully defined before the program runs. Since def is an executable statement, it can also show up nested inside another def. But unlike languages such as Pascal, nested defs don't imply nested scopes in Python. For instance, consider this example that defines a function (outer), which in turn defines and calls another function (inner) that calls itself recursively:^[6]

^[6] By "recursively," we mean that the function is called again, before a prior call exits. In this example, the function calls itself, but it could also call another function that calls it, and so on. Recursion could be replaced with a simple while or for loop here (all we're doing is counting down to zero), but we're trying to make a point about self-recursive function names and nesting. Recursion tends to be more useful for processing data structures whose shape can't be predicted when you're writing a program.

>>> def outer(x):
...     def inner(i):          # assign in outer's local
...         print i,           # i is in inner's local
...         if i: inner(i-1)   # not in my local or global!
...     inner(x)
...
>>> outer(3)
3
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 5, in outer
  File "<stdin>", line 4, in inner
NameError: inner

This won't work. A nested def really only assigns a new function object to a name in the enclosing function's scope (namespace). Within the nested function, the LGB three-scope rule still applies for all names. The nested function has access only to its own local scope, the global scope in the enclosing module, and the built-in names scope. It does not have access to names in the enclosing function's scope; no matter how deeply functions nest, each sees only three scopes.

For instance, in the example above, the nested def creates the name inner in the outer function's local scope (like any other assignment in outer would). But inside the inner function, the name inner isn't visible; it doesn't live in inner's local scope, doesn't live in the enclosing module's scope, and certainly isn't a built-in. Because inner has no access to names in outer's scope, the call to inner from inner fails and raises an exception.

4.6.2.1 Solution

Don't expect scopes to nest in Python. This is really more a matter of understanding than anomaly: the def statement is just an object constructor, not a scope nester. However, if you really need access to the nested function name from inside the nested function, simply force the nested function's name out to the enclosing module's scope with a global declaration in the outer function. Since the nested function shares the global scope with the enclosing function, it finds it there according to the LGB rule:

>>> def outer(x):
...     global inner
...     def inner(i):          # assign in enclosing module
...         print i,
...         if i: inner(i-1)   # found in my global scope now
...     inner(x)
...
>>> outer(3)
3 2 1 0
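The global declaration has a side effect worth seeing: the nested function's name really does land in the enclosing module. The following is a minimal sketch assuming Python 3; it collects the countdown in a list instead of printing it, which is a change made here for illustration.

```python
def outer(x):
    global inner               # the def below now assigns a module-level name
    def inner(i):
        # each recursive call finds inner in the global scope
        return [i] + (inner(i - 1) if i else [])
    return inner(x)

print('inner' in globals())    # checked before outer has ever run
print(outer(3))                # the countdown from 3 to 0
print('inner' in globals())    # checked again after the call
```

Every call to outer rebinds the module-level name inner, and any other global of that name is replaced, so the workaround trades nesting for a shared name.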

4.6.3 Using Defaults to Save References

Really, nested functions have no access to any names in an enclosing function, so this is actually a more general gotcha than the example above implies. To get access to names assigned prior to the nested function's def statement, you can also assign their values to the nested function's arguments as defaults. Because default arguments save their values when the def runs (not when the function is actually called), they can squirrel away objects from the enclosing function's scope:

>>> def outer(x, y):
...     def inner(a=x, b=y):     # save outer's x,y bindings/objects
...         return a**b          # can't use x and y directly here
...     return inner
...
>>> x = outer(2, 4)
>>> x()
16

Here, a call to outer returns the new function created by the nested def. When the nested def statement runs, inner's arguments a and b are assigned the values of x and y from the outer function's local scope. In effect, inner's a and b remember the values of outer's x and y. When a and b are used later in inner's body, they still refer to the values x and y had when outer ran (even though outer has already returned to its caller).^[7] This scheme works in lambdas too, since lambdas are really just shorthand for defs:

^[7] In computer-science lingo, this sort of behavior is usually called a closure: an object that remembers values in enclosing scopes, even though those scopes may not be around any more. In Python, you need to explicitly list which values are to be remembered, using argument defaults (or class object attributes, as we'll see in Chapter 6).

>>> def outer(x, y):
...     return lambda a=x, b=y: a**b
...
>>> y = outer(2, 5)
>>> y()
32
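The saved values can be inspected on the function object itself. The sketch below assumes Python 3, where a function's defaults are exposed as the __defaults__ attribute; the names f and powers are hypothetical and used only for illustration. It also shows the same technique capturing a loop variable's value on each pass.

```python
def outer(x, y):
    def inner(a=x, b=y):      # defaults are evaluated now, while outer is running
        return a ** b
    return inner

f = outer(2, 4)
print(f.__defaults__)         # the objects saved when the def ran
print(f())                    # uses both saved values
print(f(3))                   # a passed argument overrides a; b is still the saved 4

# Each lambda saves the value n had when that lambda was created.
powers = []
for n in range(3):
    powers.append(lambda base=2, exp=n: base ** exp)
print([p() for p in powers])
```

A caller can still override a saved value by passing an argument, so this is a convention for remembering state, not a way of protecting it.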

Note that defaults won't quite do the trick in the last section's example, because the name inner isn't assigned until the inner def has completed. Global declarations may be the best workaround for nested functions that call themselves:

>>> def outer(x):
...     def inner(i, self=inner):     # name not defined yet
...         print i,
...         if i: self(i-1)
...     inner(x)
...
>>> outer(3)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in outer
NameError: inner

But if you're interested in exploring the Twilight Zone of Python hackerage, you can instead save a mutable object as a default and plug in a reference to inner after the fact, in the enclosing function's body:

>>> def outer(x):
...     fillin = [None]
...     def inner(i, self=fillin):    # save mutable
...         print i,
...         if i: self[0](i-1)        # assume it's set
...     fillin[0] = inner             # plug value now
...     inner(x)
...
>>> outer(3)
3 2 1 0

Although this code illustrates Python properties (and just might amaze your friends, coworkers, and grandmother), we don't recommend it. In this example, it makes much more sense to avoid function nesting altogether:

>>> def inner(i):                 # define module level name
...     print i,
...     if i: inner(i-1)          # no worries: it's a global
...
>>> def outer(x):
...     inner(x)
...
>>> outer(3)
3 2 1 0

As a rule of thumb, the easy way out is usually the right way out.

4.6.4 Defaults and Mutable Objects

Default argument values are evaluated and saved when the def statement is run, not when the resulting function is called. That's what you want, since it lets you save values from the enclosing scope, as we've just seen. But since defaults retain an object between calls, you have to be careful about changing mutable defaults. For instance, the following function uses an empty list as a default value and then changes it in place each time the function is called:

>>> def saver(x=[]):            # saves away a list object
...     x.append(1)             # changes same object each time!
...     print x
...
>>> saver([2])                  # default not used
[2, 1]
>>> saver()                     # default used
[1]
>>> saver()                     # grows on each call
[1, 1]
>>> saver()
[1, 1, 1]

The problem is that there's just one list object here—the one created when the def was executed. You don't get a new list every time the function is called, so the list grows with each new append.
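That single shared object can be checked by identity. This is a minimal sketch assuming Python 3 and its __defaults__ function attribute; here saver returns the list instead of printing it so the results can be compared, and the names default_list, first, and second are hypothetical.

```python
def saver(x=[]):              # the list is created once, when the def runs
    x.append(1)
    return x

default_list = saver.__defaults__[0]     # the one list stored with the function
first = saver()
second = saver()

print(first is second is default_list)   # both calls returned the very same object
print(default_list)                      # it has been appended to twice
print(saver([2]))                        # a passed-in list bypasses the default
print(default_list)                      # so the stored default is unchanged here
```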

4.6.4.1 Solution

If that's not the behavior you wish, simply move the default value into the function body; as long as the value resides in code that's actually executed each time the function runs, you'll get a new object each time through:

>>> def saver(x=None):
...     if x is None:           # no argument passed?
...         x = []              # run code to make a new list
...     x.append(1)             # changes new list object
...     print x
...
>>> saver([2])
[2, 1]
>>> saver()                     # doesn't grow here
[1]
>>> saver()
[1]

By the way, the if statement above could almost be replaced by the assignment x = x or [], which takes advantage of the fact that Python's or returns one of its operand objects: if no argument was passed, x defaults to None, so the or returns the new empty list on the right. This isn't exactly the same, though: when an empty list is passed in, the function extends and prints a newly created list, rather than extending the passed-in list as the previous version does (the expression becomes [] or [], which evaluates to the new empty list on the right; see the discussion of truth tests in Chapter 3 if you don't recall why). Since real program requirements may call for either behavior, we won't pick a winner here.
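The difference between the two forms shows up only when the caller passes an empty list. The comparison below is a sketch assuming Python 3; the names saver_none and saver_or are hypothetical, and both functions return the list so the effect on the caller's object is visible.

```python
def saver_none(x=None):
    if x is None:             # only a missing argument triggers a new list
        x = []
    x.append(1)
    return x

def saver_or(x=None):
    x = x or []               # any false value (None, an empty list) is replaced
    x.append(1)
    return x

mine = []
saver_none(mine)
print(mine)                   # the caller's empty list was extended in place

mine = []
result = saver_or(mine)
print(mine, result)           # the caller's list is untouched; a new list was extended
```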
