
Solution

We'll start off with the deceptively simple question: What functions are "part of" a class, or make up the interface of a class?

The deeper questions are:

How does this answer fit with C-style object-oriented programming?
How does it fit with C++'s Koenig lookup? With the Myers Example? (I'll describe both.)
How does it affect the way we analyze class dependencies and design object models?

So what's in a class? Here's the definition of "class" again:

A class describes a set of data, along with the functions that operate on that data.

Programmers often unconsciously misinterpret this definition, saying instead: "Oh yeah, a class, that's what appears in the class definition: the member data and the member functions." But that's not the same thing, because it limits the word functions to mean just member functions. Consider:



//*** Example 1 (a)
class X { /*...*/ };
/*...*/
void f( const X& );

The question is: Is f part of X? Some people will automatically say "No" because f is a nonmember function (or free function). Others might realize something fundamentally important: If the Example 1 (a) code appears together in one header file, it is not significantly different from:



//*** Example 1 (b)
class X
{
  /*...*/
public:
  void f() const;
};

Think about this for a moment. Besides access rights,^[2] f is still the same, taking a pointer/reference to X. The this parameter is just implicit in the second version, that's all. So, if Example 1 (a) all appears in the same header, we're already starting to see that even though f is not a member of X, it's nonetheless strongly related to X. I'll show exactly what that relationship is in the next section.

^[2] Even those may be unchanged if the original f was a friend.

On the other hand, if X and f do not appear together in the same header file, then f is just some old client function, not a part of X (even if f is intended to augment X). We routinely write functions with parameters whose types come from library headers, and clearly our custom functions aren't part of those library classes.
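To make the comparison between Examples 1 (a) and 1 (b) concrete, here is a minimal sketch that supplies both forms side by side. The original X elides its body, so the name_ member, the name() accessor, and the returned strings are assumptions added purely for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical X: the data member and accessor are illustrative assumptions.
class X {
public:
    explicit X(std::string name) : name_(std::move(name)) {}

    // Member form, as in Example 1 (b): the X object arrives via the implicit this.
    std::string f() const { return "hello from " + name_; }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Nonmember form, as in Example 1 (a): the X object arrives as an explicit parameter.
inline std::string f(const X& x) { return "hello from " + x.name(); }
```

Given an X named x, the calls x.f() and f(x) do the same work here; only the spelling of the call, and the access rights, differ.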

With that example in mind, I propose the Interface Principle:

For a class X, all functions, including free functions, that both

"Mention" X

Are "supplied with" X

are logically part of X, because they form part of the interface of X.

By definition, every member function is "part of" X:

Every member function must "mention" X (a nonstatic member function has an implicit this parameter of type X* const or const X* const; a static member function is in the scope of X).
Every member function must be "supplied with" X (in X's definition).

Applying the Interface Principle to Example 1 (a) gives the same result as our original analysis. Clearly, f mentions X. If f is also "supplied with" X (for example, if they come in the same header file and/or namespace^[3]), then according to the Interface Principle, f is logically part of X because it forms part of the interface of X.

^[3] We'll examine the relationship with namespaces in detail momentarily, because it turns out that this Interface Principle behaves in exactly the same way that Koenig lookup does.

So the Interface Principle is a useful touchstone to determine what is really "part of" a class. Do you find it unintuitive that a free function should be considered part of a class? Then let's give real weight to this example by giving a more common name to f.



//*** Example 1 (c)
class X { /*...*/ };
/*...*/
ostream& operator<<( ostream&, const X& );

Here the Interface Principle's rationale is perfectly clear, because we understand how this particular free function works. If operator<< is "supplied with" X (for example, in the same header and/or namespace), then operator<< is logically part of X because it forms part of the interface of X. That makes sense even though the function is a nonmember, because we know that it's common practice for a class's author to provide operator<<. If, instead, operator<< comes, not from X's author, but from client code, then it's not part of X because it's not "supplied with" X.^[4]

^[4] The similarity between member and nonmember functions is even stronger for certain other overloadable operators. For example, when you write "a+b" you might be asking for a.operator+(b) or operator+(a,b), depending on the types of a and b.
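As a hedged illustration of operators that are "supplied with" a class, the sketch below puts a small class and its operators in one namespace. The geometry namespace, the Point class, and the to_text helper are hypothetical names invented for this example; they do not come from the text above.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace geometry {   // hypothetical namespace, for illustration only

class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }

    // Member operator: a + b is resolved as a.operator+(b).
    Point operator+(const Point& rhs) const {
        return Point(x_ + rhs.x_, y_ + rhs.y_);
    }

private:
    int x_;
    int y_;
};

// Nonmember operator: a - b is resolved as operator-(a, b).
inline Point operator-(const Point& a, const Point& b) {
    return Point(a.x() - b.x(), a.y() - b.y());
}

// Nonmember by necessity: the stream is the left-hand operand.
inline std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x() << ", " << p.y() << ')';
}

// Small helper so the result of operator<< can be inspected as a string.
inline std::string to_text(const Point& p) {
    std::ostringstream os;
    os << p;
    return os.str();
}

}  // namespace geometry
```

All three operators mention Point and ship in the same namespace, so by the Interface Principle all three are part of Point's interface, whether or not they are members.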

In this light, then, let's return to the traditional definition of a class:

A class describes a set of data, along with the functions that operate on that data.

That definition is exactly right, for it doesn't say a thing about whether the "functions" in question are members or not.

I've been using C++ terms like "namespace" to describe what "supplied with" means, so is the Interface Principle C++-specific? Or is it a general OO principle that can apply in other languages?

Consider a familiar example from another (in fact, a non-OO) language.



/*** Example 2 (a) ***/
struct _iobuf { /*...data goes here...*/ };
typedef struct _iobuf FILE;
FILE* fopen ( const char* filename,
              const char* mode );
int   fclose( FILE* stream );
int   fseek ( FILE* stream,
              long  offset,
              int   origin );
long  ftell ( FILE* stream );
     /* etc. */

This is the standard "handle technique" for writing OO code in a language that doesn't have classes. You provide a structure that holds the object's data and functions (necessarily nonmembers) that take or return pointers to that structure. These free functions construct (fopen), destroy (fclose), and manipulate (fseek, ftell, and so forth) the data.

This technique has disadvantages (for example, it relies on client programmers to refrain from fiddling with the data directly), but it's still "real" OO code. After all, a class is "a set of data, along with the functions that operate on that data." In this case, of necessity, the functions are all nonmembers, but they are still part of the interface of FILE.
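Because the internals of FILE are elided above, here is a self-contained sketch of the same handle technique applied to a made-up type. The counter structure and the counter_open, counter_close, counter_add, and counter_tell functions are all hypothetical; the code is C++ but sticks to a C-like style on purpose.

```cpp
#include <cstdlib>

// Hypothetical handle type: a plain structure holding the object's data.
struct counter {
    long value;
};

// "Constructor": allocates the object and returns a handle (null on failure).
inline counter* counter_open(long start) {
    counter* c = static_cast<counter*>(std::malloc(sizeof(counter)));
    if (c != nullptr) {
        c->value = start;
    }
    return c;
}

// "Destructor": releases the object behind the handle.
inline void counter_close(counter* c) {
    std::free(c);
}

// Manipulators: each takes the handle as an explicit first parameter,
// playing the role that the implicit this would play for a member function.
inline long counter_add(counter* c, long delta) {
    c->value += delta;
    return c->value;
}

inline long counter_tell(const counter* c) {
    return c->value;
}
```

None of these functions is a member, yet each one mentions counter and is supplied with it, which is exactly the situation FILE and fseek are in.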

Now consider an "obvious" way to rewrite Example 2 (a) in a language that does have classes.



//*** Example 2 (b)
class FILE
{
public:
  FILE( const char* filename,
        const char* mode );
  ~FILE();
  int  fseek( long offset, int origin );
  long ftell();
       /* etc. */
private:
  /*...data goes here...*/
};

The FILE* parameters have just become implicit this parameters. Here it's clear that fseek is part of FILE, just as it was in Example 2 (a), even though there it was a nonmember. We can even merrily make some functions members and some not.



//*** Example 2 (c)
class FILE
{
public:
  FILE( const char* filename,
        const char* mode );
  ~FILE();
  long ftell();
       /* etc. */
private:
  /*...data goes here...*/
};

int fseek( FILE* stream,
           long  offset,
           int   origin );

It really doesn't matter whether the functions are members. As long as they "mention" FILE and are "supplied with" FILE, they really are part of FILE. In Example 2 (a), all the functions were nonmembers, because in C they have to be. Even in C++, some functions in a class's interface have to be (or should be) nonmembers. operator<< can't be a member, because it requires a stream as the left-hand argument, and operator+ shouldn't be a member in order to allow conversions on the left-hand argument.
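The point about operator+ and left-hand conversions can be sketched as follows. The Meters class is a hypothetical value type with a deliberately non-explicit constructor from int; it is an assumption made for illustration and is not taken from the examples above.

```cpp
// Hypothetical value type with an implicit conversion from int.
class Meters {
public:
    Meters(int v) : v_(v) {}   // deliberately not explicit, to allow conversions
    int value() const { return v_; }

private:
    int v_;
};

// Nonmember operator+: both operands are ordinary parameters,
// so an int on either side can be converted to Meters.
inline Meters operator+(const Meters& a, const Meters& b) {
    return Meters(a.value() + b.value());
}
```

With this nonmember form, both m + 2 and 2 + m compile for a Meters m. Had operator+ been a member, 2 + m would not compile, because no conversion is applied to the left-hand operand in order to find a member function.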

A Deeper Look at Koenig Lookup

The Interface Principle makes even more sense when you realize that it does exactly the same thing as Koenig lookup. I'll use two examples to illustrate and define Koenig lookup. In the next section, I'll use the Myers Example to show why this is directly related to the Interface Principle.

Here's why we need Koenig lookup, using the same example as in Item 31. It's right out of the C++ standard.



//*** Example 3 (a)
namespace NS
{
  class T { };
  void f(T);
}

NS::T parm;

int main()
{
  f(parm);    // OK: calls NS::f
}

Pretty nifty, isn't it? "Obviously" the programmer shouldn't have to explicitly write NS::f(parm), because just f(parm) "obviously" means NS::f(parm), right? But what's obvious to us isn't always obvious to a compiler, especially considering there's nary a "using" in sight to bring the name f into scope. Koenig lookup lets the compiler do the right thing.

Here's how it works: Recall that "name lookup" just means that, whenever you write a call like "f(parm)", the compiler has to figure out which function named f you want. (With overloading and scoping, there could be several functions named f.) Koenig lookup says that, if you supply a function argument of class type (here parm, of type NS::T), then to find the function name the compiler is required to look, not just in the usual places, such as the local scope, but also in the namespace (here, NS) that contains the argument's type.^[5] So Example 3 (a) works: The parameter being passed to f is a T, T is defined in namespace NS, and the compiler can consider the f in namespace NS, no fuss, no muss.

^[5] There's a little more to the mechanics, but that's essentially it.
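Example 3 (a) only declares f, so it cannot be linked and run as it stands. The sketch below is a variation that can be: f is given a body that returns a tag string, and the call is wrapped in a helper. The tag string and the call_unqualified helper are assumptions added for illustration.

```cpp
#include <string>

namespace NS {
    class T { };

    // Returns a tag so a caller can see which function was selected.
    inline std::string f(T) { return "NS::f"; }
}

// Note that there is no using-declaration or using-directive anywhere.
inline std::string call_unqualified() {
    NS::T parm;
    return f(parm);   // f is found by looking in NS, the namespace of parm's type
}
```

Under these assumptions, call_unqualified() yields "NS::f", even though the name f was never brought into the calling scope.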

It's good that we don't have to explicitly qualify f, because sometimes we can't easily qualify a function name.



//*** Example 3 (b)
#include <iostream>
#include <string>  // this header declares the function
                   //   std::operator<< for strings

int main()
{
  std::string hello = "Hello, world";
  std::cout << hello;   // OK: calls std::operator<<
}

Here the compiler has no way to find operator<< without Koenig lookup, because the operator<< we want is a free function that's made known to us only as part of the string package. It would be disgraceful if the programmer were forced to qualify this function name, because then the last line couldn't use the operator naturally. Instead, we would have to write either "std::operator<<( std::cout, hello );" which is exceedingly ugly, or "using std::operator<<;" which is annoying and quickly becomes tedious if there are many operators, or "using namespace std;" which dumps all the names in std into the current namespace and thus eliminates much of the advantage of having namespaces in the first place. If those options send shivers down your spine, you understand why we need Koenig lookup.
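For comparison, here is a small sketch that writes the same string twice: once with the natural operator syntax and once with the fully qualified call described above. It writes to a std::ostringstream rather than std::cout so the result can be inspected; the helper names natural_form and qualified_form are hypothetical.

```cpp
#include <sstream>
#include <string>

// Natural form: the operator is found because the operands' types live in std.
inline std::string natural_form(const std::string& s) {
    std::ostringstream out;
    out << s;
    return out.str();
}

// Explicitly qualified form: legal, but this is the "exceedingly ugly" spelling.
inline std::string qualified_form(const std::string& s) {
    std::ostringstream out;
    std::operator<<(out, s);
    return out.str();
}
```

Both helpers produce the same text; the difference is purely in how much the caller has to spell out.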

In summary, if in the same namespace you supply a class and a free function that mentions that class,^[6] the compiler will enforce a strong relationship between the two.^[7] And that brings us back to the Interface Principle, because of the Myers Example.

^[6] By value, reference, pointer, or whatever.

^[7] Granted, that relationship is still less strong than the relationship between a class and one of its member functions. See "How Strong Is the 'Part of' Relationship?" later in this Item.

More Koenig Lookup: The Myers Example

Consider first a (slightly) simplified example.



//*** Example 4 (a)
namespace NS   // this part is
{              // typically from
  class T { }; // some header
}              // file T.h

void f( NS::T );

int main()
{
  NS::T parm;
  f(parm);     // OK: calls global f
}

Namespace NS supplies a type T, and the outside code provides a global function f that happens to take a T. This is fine, the sky is blue, the world is at peace, and everything is wonderful.

Time passes. One fine day, the author of NS helpfully adds a function:



//*** Example 4 (b)
namespace NS   // typically from
{              // some header T.h
  class T { };
  void f( T ); // <-- new function
}

void f( NS::T );

int main()
{
  NS::T parm;
  f(parm);     // ambiguous: NS::f
}              //   or global f?

Adding a function in a namespace scope "broke" code outside the namespace, even though the client code didn't write using to bring NS's names into its scope! But wait, it gets better. Nathan Myers pointed out the following interesting behavior with namespaces and Koenig lookup.



//*** The Myers Example: "Before"
namespace A
{
  class X { };
}

namespace B
{
  void f( A::X );
  void g( A::X parm )
  {
    f(parm);   // OK: calls B::f
  }
}

This is fine, the sky is blue. One fine day, the author of A helpfully adds another function.



//*** The Myers Example: "After"
namespace A
{
  class X { };
  void f( X ); // <-- new function
}

namespace B
{
  void f( A::X );
  void g( A::X parm )
  {
    f(parm);   // ambiguous: A::f or B::f?
  }
}

"Huh?" you might ask. "The whole point of namespaces is to prevent name collisions, isn't it? But adding a function in one namespace actually seems to 'break' code in a completely separate namespace." True, namespace B's code seems to break merely because it mentions a type from A. B's code didn't write a using namespace A; anywhere. It didn't even write using A::X;.

This is not a problem, and B is not broken. This is in fact exactly what should happen.^[8] If there's a function f(X) in the same namespace as X, then, according to the Interface Principle, f is part of the interface of X. It doesn't matter a whit that f happens to be a free function; to see clearly that it's nonetheless logically part of X, just give it another name.

^[8] This very example arose at the Morristown meeting in November 1997, and it's what got me thinking about this issue of membership and dependencies.



//*** Restating the Myers Example: "After"
namespace A
{
  class X { };
  ostream& operator<<( ostream&, const X& );
}

namespace B
{
  ostream& operator<<( ostream&, const A::X& );
  void g( A::X parm )
  {
    cout << parm; // ambiguous: A::operator<< or
  }               //   B::operator<<?
}

How Strong Is the "Part of" Relationship?

While the Interface Principle states that both member and nonmember functions can be logically "part of" a class, it doesn't claim that members and nonmembers are equivalent. For example, member functions automatically have full access to class internals, whereas nonmembers have such access only if they're made friends. Likewise for name lookup, including Koenig lookup, the C++ language deliberately says that a member function is to be considered more strongly related to a class than a nonmember.

//*** NOT the Myers Example
namespace A
{
  class X { };
  void f( X );
}

class B     // <-- class, not namespace
{
  void f( A::X );
  void g( A::X parm )
  {
    f(parm);   // OK: B::f, not ambiguous
  }
};

Now that we're talking about a class B, rather than a namespace B, there's no ambiguity. When the compiler finds a member named f(), it won't bother trying to use Koenig lookup to find free functions.

So in two major ways (access rules and lookup rules), even when a function is "part of" a class according to the Interface Principle, a member is more strongly related to the class than a nonmember.
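A runnable variation on the class-B case might look like the sketch below. The functions here return tag strings so the chosen function is observable, and B's members are made public so they can be called from outside; both changes are assumptions made for illustration.

```cpp
#include <string>

namespace A {
    class X { };

    // Free function supplied with X.
    inline std::string f(X) { return "A::f"; }
}

class B {   // a class, not a namespace
public:
    std::string f(A::X) { return "B::f (member)"; }

    std::string g(A::X parm) {
        // Ordinary lookup finds the member f first, so the free
        // function A::f is never considered for this call.
        return f(parm);
    }
};
```

Under these assumptions, calling g on a B object reports the member, and the call compiles without any ambiguity.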

Returning to the Myers Example itself: if client code supplies a function that mentions X and matches the signature of one provided in the same namespace as X, the call should be ambiguous. Namespace B should be forced to say which competing function it means, its own or that supplied with X. This is exactly what we should expect given the Interface Principle:

For a class X, all functions, including free functions, that both

"Mention" X

Are "supplied with" X

are logically part of X, because they form part of the interface of X.

What the Myers Example means is simply that namespaces aren't quite as independent as people originally thought, but they are still pretty independent and they fit their intended uses.
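One way for B to "say which competing function it means" is to qualify the call, as sketched below. This is one possible resolution under stated assumptions, not the only one: the tag strings and the helper names g_own and g_supplied are hypothetical, and the functions are given bodies so the choice can be observed.

```cpp
#include <string>

namespace A {
    class X { };

    // The function supplied with X.
    inline std::string f(X) { return "A::f"; }
}

namespace B {
    // B's own function with a matching signature.
    inline std::string f(A::X) { return "B::f"; }

    // An unqualified f(parm) here would be ambiguous, as in the "After" example.
    // Qualifying the name states explicitly which function is meant.
    inline std::string g_own(A::X parm)      { return B::f(parm); }
    inline std::string g_supplied(A::X parm) { return A::f(parm); }
}
```

The qualified calls compile cleanly because a qualified name is looked up only in the named namespace, so the competing function is never a candidate.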

In short, it's no accident that the Interface Principle works exactly the same way as Koenig lookup. Koenig lookup works the way it does fundamentally because of the Interface Principle.

"How Strong Is the 'Part of' Relationship?" shows why a member function is still more strongly related to a class than a nonmember.
