Using Inheritance

This section covers the essential inheritance-related syntax that you need to understand in order to create classes that inherit from other classes.

Base Classes and Derived Classes

The syntax for declaring that a class inherits from another class is as follows:

class DerivedClass : BaseClass {
    ...
}

The derived class inherits from the base class. Unlike other languages such as C++, in C# a class is allowed to derive from, at most, one other class; a class is not allowed to derive from two or more classes. However, unless DerivedClass is declared as sealed (see the section titled “Sealed Classes” later in this chapter), you can create further derived classes that inherit from DerivedClass using the same syntax:

class DerivedSubClass : DerivedClass {
    ...
}

In this way, you can create inheritance hierarchies.

Suppose you are writing a syntax analyzer as part of a compiler. You need to define a class that represents each of the elements of a program according to the syntax rules of the language. Such elements are sometimes referred to as tokens. You could declare the Token class as shown below. The constructor builds a Token object from the string of characters passed in (which could be a keyword, an identifier, a piece of white space, or any other valid piece of syntax for the language being parsed):

class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}

You could then define classes for each different classification (type) of token, based on the Token class, adding additional methods as necessary. For example:

class IdentifierToken : Token
{
    ...
}

TIP
C++ programmers should note that you do not and cannot explicitly specify whether the inheritance is public, private, or protected. C# inheritance is always implicitly public. Java programmers should note the use of the colon, and that there is no extends keyword.

Remember that the System.Object class is the root class of all classes. All classes implicitly derive from the System.Object class. For example, if you implement the Token class like this:

class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}

The compiler silently rewrites it as the following code (which you can write explicitly if you really want to):

class Token : System.Object
{
    public Token(string name)
    {
        ...
    }
    ...
}

What this means in practical terms is that all classes that you define automatically inherit all the features of the System.Object class. This includes methods such as ToString (first discussed in Chapter 2, “Working with Variables, Operators, and Expressions”), which is used to convert an object to a string.
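As a minimal sketch of this point, the following program calls members that Token never declares. The constructor body is left empty on purpose, and the name "count" is only an illustrative value.

```csharp
using System;

// Token declares no base class, so it derives from System.Object implicitly.
class Token
{
    public Token(string name)
    {
        // A real implementation would store or classify the text here.
    }
}

class Program
{
    static void Main()
    {
        Token t = new Token("count");

        // None of these members is declared in Token; all are inherited
        // from System.Object.
        Console.WriteLine(t.ToString());                 // the name of the type
        Console.WriteLine(t.GetType() == typeof(Token)); // True
        Console.WriteLine(t.Equals(t));                  // True
    }
}
```

The string returned by the inherited ToString includes the namespace if Token is declared inside one.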

Calling Base Class Constructors

All classes have at least one constructor. (Remember that if you don't provide one, the compiler generates a default constructor for you.) A derived class automatically contains all fields from the base class. These fields require initialization when an object is created. Therefore, the constructor for a derived class must call the constructor for its base class. You use the base keyword in the definition of the derived class constructor to call a base class constructor. Here's an example:

class IdentifierToken : Token
{
    public IdentifierToken(string name)
           : base(name) // calls Token(name)
    {
        ...
    }
    ...
}

If you don't explicitly call a base class constructor in a derived class constructor, the compiler attempts to silently insert a call to the base class's default constructor. Taking the earlier example, the compiler will rewrite this:

class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}

As this:

class Token : System.Object
{
    public Token(string name)
        : base()
    {
        ...
    }
    ...
}

This works because System.Object has a public default constructor. However, not all classes have a public default constructor, in which case forgetting to call the base class constructor results in a compile-time error, because the compiler must insert a call to a constructor, but does not know which one to use. For example:

class IdentifierToken : Token
{
    public IdentifierToken(string name)
    // error, base class Token does not have
    // a public default constructor
    {
        ...
    }
    ...
}
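The fix is to name the base class constructor explicitly. The following sketch also shows the order in which the constructors run; the static Log list is a hypothetical addition used only to make that order visible, and is not part of the Token class described in this chapter.

```csharp
using System;
using System.Collections.Generic;

class Token
{
    // Hypothetical log, present only to record the order of constructor calls.
    public static List<string> Log = new List<string>();

    public Token(string name)
    {
        Log.Add("Token(" + name + ")");
    }
}

class IdentifierToken : Token
{
    public IdentifierToken(string name)
        : base(name) // without this call the class does not compile
    {
        Log.Add("IdentifierToken(" + name + ")");
    }
}

class Program
{
    static void Main()
    {
        new IdentifierToken("count");

        foreach (string entry in Token.Log)
        {
            Console.WriteLine(entry);
        }
        // Prints:
        // Token(count)
        // IdentifierToken(count)
    }
}
```

The body of the base class constructor runs before the body of the derived class constructor, so the inherited part of the object is initialized first.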

Assigning Classes

In previous examples in this book, you have seen how to declare a variable using a class type, and then use the new keyword to create an object. You have also seen how the type-checking rules of C# prevent you from assigning an object of one type to a variable declared as a different type. For example, given the definitions of the Token, IdentifierToken, and KeywordToken classes, the following code is illegal:

class Token
{
    ...
}

class IdentifierToken : Token
{
    ...
}

class KeywordToken : Token
{
    ...
}

...
IdentifierToken it = new IdentifierToken();
KeywordToken kt = it;  // error – different types

However, it is possible to refer to an object from a variable of a different type as long as the type used is a class that is higher up the inheritance hierarchy. So the following statements are legal:

IdentifierToken it = new IdentifierToken();
Token t = it; // legal as Token is a base class of IdentifierToken

The inheritance hierarchy means that you can think of an IdentifierToken as a special type of Token (it has everything that a Token has), with a few extra bits (defined by any methods and fields you add to the IdentifierToken class). You can also make a Token variable refer to a KeywordToken object. There is one significant limitation, however: when referring to a KeywordToken or IdentifierToken object using a Token variable, you can access only the methods and fields that are defined by the Token class. Any additional methods defined by the KeywordToken or IdentifierToken classes are available only when using a KeywordToken or IdentifierToken variable.

NOTE
This explains why you can assign almost anything to an object variable. Remember that object is an alias for System.Object, and all classes inherit from System.Object either directly or indirectly.
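The following sketch puts these assignment rules together. The Text and IsValidName methods are hypothetical members invented for the illustration; the chapter does not define them.

```csharp
using System;

class Token
{
    // Hypothetical member shared by every kind of token.
    public string Text() { return "token"; }
}

class IdentifierToken : Token
{
    // Hypothetical extra member that only identifiers have.
    public bool IsValidName() { return true; }
}

class KeywordToken : Token
{
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken();
        Token t = it;   // legal: an IdentifierToken is a Token
        object o = it;  // legal: every class derives from System.Object

        // All three variables refer to the same object.
        Console.WriteLine(object.ReferenceEquals(t, it)); // True
        Console.WriteLine(object.ReferenceEquals(o, it)); // True

        Console.WriteLine(t.Text());         // fine: Text is defined by Token
        Console.WriteLine(it.IsValidName()); // fine: the variable is an IdentifierToken
        // t.IsValidName();                  // compile-time error: Token has no such method

        t = new KeywordToken();              // the same variable can refer to a KeywordToken
    }
}
```

Assigning to a base class variable does not copy or convert the object. It only changes which members the compiler lets you reach through that variable.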

new Methods

One of the hardest problems in the realm of computer programming is the task of thinking up unique and meaningful names for identifiers. If you are defining a method for a class, and that class is part of an inheritance hierarchy, sooner or later you are going to try to reuse a name that is already in use by one of the classes higher up the hierarchy. If a base class and a derived class happen to declare two methods that have the same signature (the method signature is the name of the method and the number and types of its parameters), you will receive a warning when you compile the application. The method in the derived class masks (or hides) the method in the base class that has the same signature. For example, if you compile the following code, the compiler will generate a warning message telling you that IdentifierToken.Name hides the inherited method Token.Name:

class Token
{
    ...
    public string Name() { ... }
}

class IdentifierToken : Token
{
    ...
    public string Name() { ... }
}

Although your code will compile and run, you should take this warning seriously. If another class derives from IdentifierToken and calls the Name method, it might be expecting the method implemented in the Token class to be called. However, the Name method in the IdentifierToken class hides the Name method in the Token class, and it will be called instead. Most of the time, such a coincidence is at best a source of confusion, and you should consider renaming methods to avoid clashes. However, if you're sure that you want the two methods to have the same signature, you can silence the warning by using the new keyword as follows:

class Token
{
    ...
    public string Name() { ... }
}

class IdentifierToken : Token
{
    ...
    new public string Name() { ... }
}

Using the new keyword like this does not change the fact that the two methods are completely unrelated and that hiding still occurs. It just turns the warning off. In effect, the new keyword says, “I know what I'm doing, so stop showing me these warnings.”
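To see what hiding means in practice, consider this small sketch. The strings returned by the two Name methods are placeholders chosen so that the output shows which method ran.

```csharp
using System;

class Token
{
    public string Name() { return "Token.Name"; }
}

class IdentifierToken : Token
{
    // new acknowledges the hiding; it does not link the two methods.
    new public string Name() { return "IdentifierToken.Name"; }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken();
        Token t = it; // the same object, viewed through a Token variable

        Console.WriteLine(it.Name()); // IdentifierToken.Name
        Console.WriteLine(t.Name());  // Token.Name
    }
}
```

The method that runs is chosen from the type of the variable, not from the type of the object it refers to. This is the behavior that makes accidental hiding a source of confusion.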

Virtual Methods

Sometimes you do want to hide the way in which a method is implemented in a base class. As an example, consider the ToString method in System.Object. The purpose of ToString is to convert an object to its string representation. Because this method is very useful, it is a member of System.Object, thereby automatically providing all classes with a ToString method. However, how does the version of ToString implemented by System.Object know how to convert a derived class into a string? A derived class might contain any number of fields with interesting values that should be part of the string. The answer is that the implementation of ToString in System.Object is actually a bit simplistic. All it can do is convert an object into a string that represents its type, such as “Token” or “IdentifierToken”. This is not too useful after all. So, why provide a method that is so useless? The answer to this second question requires a bit of detailed thought.

Obviously, ToString is a fine idea in concept (all classes should provide a method that can be used to convert objects into strings). It is only the implementation that is problematic. In fact, you are not expected to call the ToString method defined by System.Object—it is simply a placeholder. Instead, you should provide your own version of the ToString method in each class you define, overriding the default implementation in System.Object. The version in System.Object is only there as a safety net, in case a class does not implement its own ToString method. In this way, you can be confident that you can call ToString on any object, and the method will return its contents as a string, regardless of how it is implemented.

A method that is intended to be overridden is called a virtual method. You should be clear on the difference between overriding a method and hiding a method. Overriding a method is a mechanism for providing different implementations of the same method—the methods are all related because they are intended to perform the same task. Hiding a method is a means of replacing one method with another—the methods are usually unrelated and might perform totally different tasks. Overriding a method is a useful programming concept; hiding a method is probably an error.

You can mark a method as a virtual method by using the virtual keyword. For example, the ToString method in the System.Object class is defined like this:

namespace System
{
    class Object
    {
        public virtual string ToString()
        {
            ...
        }
        ...
    }
    ...
}

NOTE
A C# method is not virtual by default. This is the same as in C++, but a major difference from Java, in which all methods are virtual by default.

Virtual Methods and Polymorphism

Virtual methods allow you to call different versions of the same method, based on the type of the object determined dynamically by the runtime. Consider the following example classes that define a variation on the Mammal hierarchy described earlier:

class Mammal
{
    ...
    public virtual string GetTypeName()
    {
        return "This is a mammal";
    }
}

class Horse : Mammal
{
    ...
    public override string GetTypeName()
    {
        return "This is a horse";
    }
}

class Whale : Mammal
{
    ...
    public override string GetTypeName()
    {
        return "This is a whale";
    }
}

class Kangaroo : Mammal
{
    ...
}

Notice two things: first, the GetTypeName method in the Horse and Whale classes is declared with the override keyword (which will be described shortly), and second, the Kangaroo class does not have a GetTypeName method.

Now examine the following block of code:

Mammal m;
Horse h = new Horse();
Whale w = new Whale();
Kangaroo k = new Kangaroo();

m = h;
Console.WriteLine(m.GetTypeName()); // Horse
m = w;
Console.WriteLine(m.GetTypeName()); // Whale
m = k;
Console.WriteLine(m.GetTypeName()); // Kangaroo

What will be output by the three different Console.WriteLine statements? At first glance, you would expect them all to print “This is a mammal,” because each statement calls the GetTypeName method on the m variable, which is a Mammal. However, in the first case, you can see that m is actually a reference to a Horse (you are allowed to assign a Horse to a Mammal variable because the Horse class is derived from the Mammal class; all Horses are Mammals). Because the GetTypeName method is defined as virtual, the runtime works out that it should call the Horse.GetTypeName method, so the statement actually prints the message “This is a horse.” The same logic applies to the second Console.WriteLine statement, which outputs the message “This is a whale.” The third statement calls GetTypeName on a Kangaroo object. However, the Kangaroo class does not have a GetTypeName method, so the default method in the Mammal class is called, returning the string “This is a mammal.”

This phenomenon of the same statement invoking multiple methods is called polymorphism, which literally means “many forms.”
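Polymorphism is most visible when a single statement runs inside a loop. The following sketch reuses the Mammal classes shown above, with the elided members left out, and places one object of each type in an array.

```csharp
using System;

class Mammal
{
    public virtual string GetTypeName() { return "This is a mammal"; }
}

class Horse : Mammal
{
    public override string GetTypeName() { return "This is a horse"; }
}

class Whale : Mammal
{
    public override string GetTypeName() { return "This is a whale"; }
}

class Kangaroo : Mammal
{
    // No override, so Kangaroo uses the Mammal implementation.
}

class Program
{
    static void Main()
    {
        Mammal[] mammals = { new Horse(), new Whale(), new Kangaroo() };

        // One call site; the runtime picks the method for each object.
        foreach (Mammal m in mammals)
        {
            Console.WriteLine(m.GetTypeName());
        }
        // Prints:
        // This is a horse
        // This is a whale
        // This is a mammal
    }
}
```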

override Methods

If a base class declares that a method is virtual, a derived class can use the override keyword to declare another implementation of that method. For example:

class IdentifierToken : Token
{
    ...
    public override string Name() { ... }
}

There are some important rules you must follow when declaring polymorphic methods by using the virtual and override keywords:

You're not allowed to declare a private method by using the virtual or override keyword. If you try, you'll get a compile-time error. Private really is private.
The two methods must have identical signatures and return types. That is, they must have the same name, the same parameter types, and the same return type.
The two methods must have the same access. For example, if one of the two methods is public, the other must also be public. (C++ programmers should take note. In C++, the methods can have different accessibility.)
You can only override a virtual method. If the base class method is not virtual and you try to override it, you'll get a compile-time error. This is sensible; it should be up to the base class to decide whether its methods can be overridden.
If the derived class does not declare the method by using the override keyword, it does not override the base class method. In other words, it becomes an implementation of a completely different method that happens to have the same name. As before, this will cause a compile-time hiding warning, which you can silence by using the new keyword as previously described.
An override method is implicitly virtual and can itself be overridden in a further derived class. However, you are not allowed to explicitly declare that an override method is virtual by using the virtual keyword.
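The following sketch applies several of these rules at once. It assumes a Token class whose Name method is virtual, and ScopedIdentifierToken is a hypothetical class added only to show an override being overridden again. The sketch also overrides ToString, replacing the placeholder inherited from System.Object.

```csharp
using System;

class Token
{
    public virtual string Name() { return "token"; }

    // Replaces the placeholder implementation in System.Object.
    public override string ToString() { return "[" + Name() + "]"; }
}

class IdentifierToken : Token
{
    // Same name, parameters, return type, and access as Token.Name.
    public override string Name() { return "identifier"; }
}

// Hypothetical further derived class.
class ScopedIdentifierToken : IdentifierToken
{
    // Legal without the virtual keyword on IdentifierToken.Name,
    // because an override method is implicitly virtual.
    public override string Name() { return "scoped identifier"; }
}

class Program
{
    static void Main()
    {
        Token t = new ScopedIdentifierToken();

        Console.WriteLine(t.Name()); // scoped identifier
        Console.WriteLine(t);        // [scoped identifier]
    }
}
```

Console.WriteLine calls ToString on the object it is given, and ToString in turn calls the most derived version of Name, so both lines reflect the actual type of the object.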

protected Access

The public and private access keywords create two extremes of accessibility: public fields and methods of a class are accessible to everyone, whereas private fields and methods of a class are accessible to only the class itself.

These two extremes are sufficient when considering classes in isolation. However, as all experienced object-oriented programmers know, isolated classes cannot solve complex problems. Inheritance is a very powerful way of connecting classes, and there is clearly a very special and close relationship between a derived class and its base class. Frequently it is useful for a base class to allow derived classes to access some of its members, while hiding these same members from classes that are not part of the hierarchy. In this situation, you can use the protected keyword to tag members:

A derived class can access a protected base class member. In other words, inside the derived class, a protected base class member is effectively public.
If the class is not a derived class, it cannot access a protected class member. In other words, inside a class that is not a derived class, a protected class member is effectively private.

C# gives programmers complete freedom to declare both methods and fields as protected. However, most object-oriented guidelines recommend keeping your fields strictly private. Public fields violate encapsulation because all users of the class have direct, unrestricted access to the fields. Protected fields maintain encapsulation for users of a class, for whom the protected fields are inaccessible. However, protected fields still allow encapsulation to be violated by classes that inherit from the class.

NOTE
You can access a protected base class member not only in a derived class, but also in classes derived from the derived class. A protected base class member retains its protected accessibility in a derived class and is accessible to further derived classes.
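As a minimal sketch of these access rules, the following program keeps the field private and exposes it to derived classes through a protected method. The RawName and Describe methods are hypothetical names chosen for the illustration.

```csharp
using System;

class Token
{
    private string name; // accessible only inside Token

    public Token(string name)
    {
        this.name = name;
    }

    // Accessible inside Token and inside classes derived from it.
    protected string RawName() { return name; }
}

class IdentifierToken : Token
{
    public IdentifierToken(string name)
        : base(name)
    {
    }

    public string Describe()
    {
        // return "identifier " + name;   // compile-time error: name is private to Token
        return "identifier " + RawName(); // allowed: protected member of the base class
    }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken("count");

        Console.WriteLine(it.Describe()); // identifier count
        // it.RawName();                  // compile-time error: Program does not derive from Token
    }
}
```

Keeping the field private and making only the method protected follows the guideline above: derived classes can read the value, but they cannot change it behind the base class's back.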