I l@ve RuBoard

6.9 Odds and Ends

6.9.1 Private Attributes (New in 1.5)

In the last chapter, we noted that every name assigned at the top level of a file is exported by a module. By default, the same holds for classes; data hiding is a convention, and clients may fetch or change any class or instance attribute they like. In fact, attributes are all public and virtual in C++ terms; they're all accessible everywhere and all looked up dynamically at runtime.
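As a quick illustration of that openness, here is a minimal sketch; the Account class and its rate and balance attributes are hypothetical names made up for this example, and the code assumes a current Python 3 interpreter:

```python
class Account:
    rate = 0.05                      # class attribute

    def __init__(self, balance):
        self.balance = balance       # instance attribute

acct = Account(100)

# Nothing stops client code from rebinding either attribute.
acct.balance = -1
Account.rate = 0.5

# Lookup is dynamic: the instance sees the changed class attribute.
print(acct.balance, acct.rate)       # prints: -1 0.5
```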

That was true at least until Python 1.5, when Guido introduced the notion of name mangling to localize some names in classes. Private names are an advanced feature, entirely optional, and probably won't be very useful until you start writing large class hierarchies. But here's an overview for the curious.

In Python 1.5, names inside a class statement that start with two underscores (and don't end with two underscores) are automatically changed to include the name of the enclosing class. For instance, a name like __X in a class Class is changed to _Class__X automatically. Because the modified name includes the name of the enclosing class, it's somewhat unique; it won't clash with similar names in other classes in a hierarchy.

Python mangles names wherever they appear in the class. For example, an instance attribute called self.__X is transformed to self._Class__X, thereby mangling an attribute name for instance objects too. Since more than one class may add attributes to an instance, name mangling helps avoid clashes automatically.
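To see the effect on instance objects, consider the following sketch. The class names C1, C2, and C3 and the methods set1 and set2 are hypothetical, chosen only for this illustration; both superclasses assign self.__X, yet the instance ends up with two separate attributes:

```python
class C1:
    def set1(self):
        self.__X = 1                 # stored on the instance as _C1__X

class C2:
    def set2(self):
        self.__X = 2                 # stored on the instance as _C2__X

class C3(C1, C2):
    pass

obj = C3()
obj.set1()
obj.set2()

# Each class got its own slot in the instance's namespace.
print(sorted(obj.__dict__))          # prints: ['_C1__X', '_C2__X']
```

Without the double underscores, the second assignment would simply overwrite the first.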

Name mangling happens only in class statements and only for names you write with two leading underscores. Because of that, it can make code somewhat unreadable. It also isn't quite the same as private declarations in C++ (if you know the name of the enclosing class, you can still get to mangled attributes!), but it can avoid accidental name clashes when an attribute name is used by more than one class of a hierarchy.
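The sketch below shows both halves of that caveat, using a hypothetical class Spam with a counter attribute invented for the example. The unmangled name fails outside the class statement, but the mangled spelling works from anywhere:

```python
class Spam:
    def __init__(self):
        self.__count = 0             # mangled to _Spam__count

    def bump(self):
        self.__count += 1
        return self.__count

s = Spam()
s.bump()

# Outside a class statement the name is not mangled, so this lookup fails.
try:
    s.__count
except AttributeError:
    hidden = True
else:
    hidden = False

# Spelling out the mangled name still gets through.
s._Spam__count = 41
print(hidden, s.bump())              # prints: True 42
```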

6.9.2 Documentation Strings

Now that we know about classes, we can tell what those __doc__ attributes we've seen are all about. So far we've been using comments that start with a # to describe our code. Comments are useful for humans reading our programs, but they aren't available when the program runs. Python also lets us associate strings of documentation with program-unit objects and provides a special syntax for it. If a module file, def statement, or class statement begins with a string constant instead of a statement, Python stuffs the string into the __doc__ attribute of the generated object. For instance, the following program defines documentation strings for multiple objects:

"I am: docstr.__doc__"

class spam:
    "I am: spam.__doc__ or docstr.spam.__doc__"

    def method(self, arg):
        "I am: spam.method.__doc__ or self.method.__doc__"
        pass

def func(args):
    "I am: docstr.func.__doc__"
    pass

The main advantage of documentation strings is that they stick around at runtime; if it's been coded as a documentation string, you can qualify an object to fetch its documentation.

>>> import docstr
>>> docstr.__doc__
'I am: docstr.__doc__'
>>> docstr.spam.__doc__
'I am: spam.__doc__ or docstr.spam.__doc__'
>>> docstr.spam.method.__doc__
'I am: spam.method.__doc__ or self.method.__doc__'
>>> docstr.func.__doc__
'I am: docstr.func.__doc__'

This can be especially useful during development. For instance, you can look up components' documentation at the interactive command line as done above, without having to go to the source file to see # comments. Similarly, a Python object browser can take advantage of documentation strings to display descriptions along with objects.
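As a rough idea of what such a browser might do, here is a minimal sketch. The browse function is a hypothetical helper written for this example, not a standard tool, and the spam class is repeated inline (with an extra undocumented method, bare) so the code runs on its own under Python 3:

```python
import types

class spam:
    "I am: spam.__doc__"

    def method(self, arg):
        "I am: spam.method.__doc__"

    def bare(self):
        pass

def browse(obj, indent=0):
    "Return indented 'name: docstring' lines for obj and its public members."
    doc = obj.__doc__ or "(no docstring)"
    lines = [" " * indent + obj.__name__ + ": " + doc]
    if isinstance(obj, type):
        # Descend only into plain functions and nested classes.
        for name, member in sorted(vars(obj).items()):
            if isinstance(member, (type, types.FunctionType)) and not name.startswith("_"):
                lines.extend(browse(member, indent + 4))
    return lines

print("\n".join(browse(spam)))
```

Run as is, this prints the class's documentation string followed by an indented line for each of its two methods, with a placeholder for the one that has none.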

On the other hand, documentation strings are not universally used by Python programmers. To get the most benefit from them, programmers need to follow some sort of conventions in their documentation styles, and it's our experience that these sorts of conventions are rarely implemented or followed in practice. Further, documentation strings are available at runtime, but they are also less flexible than # comments (which can appear anywhere in a program). Both forms are useful tools, and any program documentation is a good thing, as long as it's accurate.

6.9.3 Classes Versus Modules

Finally, let's step back for a moment and compare the topics of the last two chapters—modules and classes. Since they're both about namespaces, the distinction can sometimes be confusing. In short:

Modules

Are data/logic packages
Are created by writing Python files or C extensions
Are used by being imported

Classes

Implement new objects
Are created by class statements
Are used by being called
Always live in a module

Classes also support extra features modules don't, such as operator overloading, multiple instances, and inheritance. Although both are namespaces, we hope you can tell by now that they're very different animals.
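The contrast can be sketched in a few lines. Here the standard math module stands in for any module, while Counter and LoudCounter are hypothetical classes made up for this example (Python 3 assumed):

```python
import math                          # a module is used by being imported
import math as math_again            # importing again yields the same object

class Counter:                       # a class is used by being called
    def __init__(self, start=0):
        self.value = start

    def __add__(self, n):            # operator overloading
        return self.__class__(self.value + n)

class LoudCounter(Counter):          # inheritance
    def __repr__(self):
        return "LoudCounter(%d)" % self.value

a = Counter(1)
b = Counter(1)                       # two independent instances

# prints: True False 6 LoudCounter(4)
print(math is math_again, a is b, (a + 5).value, LoudCounter(3) + 1)
```

There is only ever one math module object per program, but each call to Counter makes a new object with its own state.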
