class.

5.9 The Mechanism of Polymorphism

The fruit example, as written, assumes a domain where apples, oranges, and bananas are used separately. They just happen to have a common abstraction (fruit). Let us consider a domain where someone would like to walk up to a fruit, not knowing what type of fruit it happens to be, and tell it to compute its cost. The fruit basket object/class shown in Figure 5.16 is an example where this type of operation might be applicable.

Figure 5.16. The `FruitBasket` class.

graphics/05fig16.gif

For some of the operations on the fruit basket, it is convenient to think of a fruit basket as a list of fruit. Operations like how_many() do not care what types of fruits have been placed in the fruit basket. Other operations, such as cost, need to know the explicit type of fruit in the fruit basket since apples cost 50 cents, bananas cost 40 cents, and oranges cost 60 cents. In action-oriented programming, this type of construct is usually implemented as a structure-embedded union (in C) or variant record (in Pascal) (see Figure 5.17).

Figure 5.17. The `FruitBasket` class.

graphics/05fig17.gif

A design that uses variant records to implement the fruit basket inevitably requires explicit case analysis (e.g., a case statement, nested if-then-else statements) to determine which type was stored in the variant record. For example, the cost function for the variant record fruit would look like the following pseudocode:


Function fruit_cost( fruit f) 

   perform case analysis on the type field of f

       case Apple:  return 50

       case Banana: return 40

       case Orange:  return 60

End

The problem with explicit case analysis is that if we decide to add a new type of fruit, we need to add a new case to the case statement. When we modify existing code, we risk introducing new bugs into that code. Watch out for the designer who claims that nothing can go wrong since there is only one added case statement. In reality, there is never one added case statement; there are usually many of them sprinkled throughout the code. The probability of forgetting to add the appropriate case to one of them is high [10].

The object-oriented paradigm solves this problem through a mechanism known as polymorphism or dynamic binding. The idea behind polymorphism is that a group of heterogeneous objects (e.g., apples, oranges, bananas) can be made to look homogeneous (e.g., a bunch of fruit), but can then be distinguished based on their own specific type at runtime. If you do not mind believing in magic, then imagine the ability to walk up to any fruit and tell it to give you its cost. You know the object to which you sent the cost message is not a fruit object because there is no such thing as a fruit object (fruit is an abstract class). Therefore, it is an apple, or a banana, or an orange masquerading as a fruit (see Figure 5.18). When the message is sent, the system figures out (magic) which type of fruit you are actually sending the message to and calls the appropriate method, i.e., the cost method of Apple, or the cost method of Banana, or the cost method of Orange.

Figure 5.18. The `FruitBasket` and polymorphism.

graphics/05fig18.gif

If you do not like believing in magic, there is an explanation as to how polymorphism is implemented. In the object-oriented community, you may hear people talk about true polymorphism versus polymorphism. The argument here revolves around the implementation of polymorphism. Interpreted languages generally make all message sends polymorphic by definition of the language (e.g., SmallTalk). The typical implementation is for the system to ask the object for its type at the time the message is sent (see Figure 5.19). The object gives the system a string with the name of its type, e.g., "Apple." The system hashes this string with the string containing the name of the message, e.g., "cost." This gives the system an index into a hash table where the address of the cost method for the apple class lives. It then jumps to this address, which completes the message send. This is often called true polymorphism.

Figure 5.19. The implementation of polymorphism (SmallTalk).

graphics/05fig19.gif

Compiled languages such as C++, which are concerned with higher levels of efficiency, allow the implementors of a class to decide which functions should be polymorphic (or dynamically bound) and which should be monomorphic (or statically bound). These languages typically build jump tables for each class containing poly-morphic functions. As is illustrated in Figure 5.20, a hidden pointer is added to each object, which points at its particular jump table. The address of this jump table maintains the particular type of the base class we are examining, e.g., apple objects point at the apple jump table even if they are being viewed as fruit. At runtime, when a derived class is built, its constructor points the hidden pointer at the appropriate jump table. In order to execute a polymorphic function call, the system need only jump into the jump table with the constant offset defined by the name of the message. No hashing of strings or hash table problems (e.g., collisions) need to be addressed at runtime.

Figure 5.20. The implementation of polymorphism (C++).

graphics/05fig20.jpg

There is another refinement to polymorphism, called pure polymorphism. A pure polymorphic function has no default definition in the base class. Derived classes that inherit a pure polymorphic function are said to be inheriting for interface, that is, they are inheriting messages, not methods. In the fruit example, we state that the print message for the Fruit class has a good default method. The best a fruit can do is print, "Hi, I'm a .3-pound red fruit." If a derived class does not define its own print function, then it can at least get this default through inheritance. On the other hand, the cost function has no good default. The Fruit class wants to state that all derived classes of Fruit must define a cost method, but it has no clue as to what the algorithm should be. This is the role of pure polymorphism. We say that print for fruit is polymorphic, and cost for fruit is pure polymorphic. In C++, an abstract base class is implemented either as a class that has one or more pure polymorphic functions and/or as a class that has no public constructor.

Heuristic 5.9

If two or more classes share only common data (no common behavior), then that common data should be placed in a class that will be contained by each sharing class.

Heuristic 5.10

If two or more classes have common data and behavior (i.e., methods), then those classes should each inherit from a common base class that captures those data and methods.

Heuristic 5.11

If two or more classes share only a common interface (i.e., messages, not methods), then they should inherit from a common base class only if they will be used polymorphically.

The last heuristic refers to the fact that inheritance for interface does not buy you anything unless the derived objects will be required to perform a runtime type discrimination. If all valves turn on and turn off, but each valve does it differently, then the base class "valve" is not useful unless there exists some place in the application where a generic valve will need to decide which it is at runtime. If water valves are always used in one place, and oil valves are always used in another place, then runtime-type discrimination is unnecessary and the base class "valve" adds no interesting perspectives to the design. Of course, a designer is free to argue that the objects are likely to be polymorphic in the future, thereby justifying the existence of the base class through extensibility concerns.

Heuristic 5.12

Explicit case analysis on the type of an object is usually an error. The designer should use polymorphism in most of these cases.

If a designer creates a method that states, "If you are of type1, do this; else if you are of type2, do this; else if are of type3, do this, else if …," he or she is making a mistake. The better approach is to have all of the types involved in the explicit case analysis inherit from a common class. This common class defines a pure polymorphic function called do_this, and each type can write its own do_this method. Now the method the designer creates need only send the do_this message to the object in question. The polymorphism mechanism can perform the case analysis implicitly, eliminating the need to modify existing code when a new type is added to the system.

< Free Open Study >