This section covers the essential inheritance-related syntax that you need to understand in order to create classes that inherit from other classes.
The syntax for declaring that a class inherits from another class is as follows:
class DerivedClass : BaseClass { ... }
The derived class inherits from the base class. Unlike other languages such as C++, in C# a class is allowed to derive from, at most, one other class; a class is not allowed to derive from two or more classes. However, unless DerivedClass is declared as sealed (see the section titled “Sealed Classes” later in this chapter), you can create further derived classes that inherit from DerivedClass using the same syntax:
class DerivedSubClass : DerivedClass { ... }
In this way, you can create inheritance hierarchies.
Suppose you are writing a syntax analyzer as part of a compiler. You need to define a class that represents each of the elements of a program according to the syntax rules of the language. Such elements are sometimes referred to as tokens. You could declare the Token class as shown below. The constructor builds a Token object from the string of characters passed in (which could be a keyword, an identifier, a piece of white space, or any other valid piece of syntax for the language being parsed):
class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}
You could then define classes for each different classification (type) of token, based on the Token class, adding additional methods as necessary. For example:
class IdentifierToken : Token { ... }
Remember that the System.Object class is the root class of all classes. All classes implicitly derive from the System.Object class. For example, if you implement the Token class like this:
class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}
The compiler silently rewrites it as the following code (which you can write explicitly if you really want to):
class Token : System.Object
{
    public Token(string name)
    {
        ...
    }
    ...
}
What this means in practical terms is that all classes that you define automatically inherit all the features of the System.Object class. This includes methods such as ToString (first discussed in Chapter 2, “Working with Variables, Operators, and Expressions”), which is used to convert an object to a string.
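The implicit base class can be observed directly. The following is a minimal sketch, assuming a Token class declared in the global namespace with an empty constructor body; it is an illustration rather than the chapter's own Token implementation.

```csharp
using System;

class Token
{
    // The constructor body is left empty for this illustration.
    public Token(string name) { }
}

class Program
{
    static void Main()
    {
        Token t = new Token("while");

        // Token names no base class, yet its base type is System.Object.
        Console.WriteLine(typeof(Token).BaseType == typeof(object)); // True

        // ToString is inherited from System.Object; Token does not define it.
        Console.WriteLine(t.ToString()); // Token
    }
}
```

The inherited ToString returns the type name here because Token does not supply its own version.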
All classes have at least one constructor. (Remember that if you don't provide one, the compiler generates a default constructor for you.) A derived class automatically contains all fields from the base class. These fields require initialization when an object is created. Therefore, the constructor for a derived class must call the constructor for its base class. You use the base keyword to call a base class constructor when the constructor is defined. Here's an example:
class IdentifierToken : Token
{
    public IdentifierToken(string name)
        : base(name) // calls Token(name)
    {
        ...
    }
    ...
}
If you don't explicitly call a base class constructor in a derived class constructor, the compiler attempts to silently insert a call to the base class's default constructor. Taking the earlier example, the compiler will rewrite this:
class Token
{
    public Token(string name)
    {
        ...
    }
    ...
}
As this:
class Token : System.Object
{
    public Token(string name)
        : base()
    {
        ...
    }
    ...
}
This works because System.Object has a public default constructor. However, not all classes have a public default constructor, in which case forgetting to call the base class constructor results in a compile-time error, because the compiler must insert a call to a constructor, but does not know which one to use. For example:
class IdentifierToken : Token
{
    public IdentifierToken(string name)
    // error, base class Token does not have
    // a public default constructor
    {
        ...
    }
    ...
}
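To make the order of construction concrete, here is a small runnable sketch. The private name field, the Name method body, and the console messages are assumptions added for illustration; only the class names and the base(name) call come from the text.

```csharp
using System;

class Token
{
    private readonly string name; // assumed field, for illustration only

    public Token(string name)
    {
        this.name = name;
        Console.WriteLine("Token constructor: " + name);
    }

    public string Name() { return name; }
}

class IdentifierToken : Token
{
    // base(name) runs Token(name) before this constructor body executes.
    public IdentifierToken(string name) : base(name)
    {
        Console.WriteLine("IdentifierToken constructor");
    }

    // Omitting the base call would not compile, because Token has no
    // parameterless constructor for the compiler to call:
    // public IdentifierToken() { }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken("count");
        // Prints "Token constructor: count", then "IdentifierToken constructor".
        Console.WriteLine(it.Name()); // count
    }
}
```

The base class part of the object is initialized first, so the derived constructor body can rely on the inherited fields already being set.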
In previous examples in this book, you have seen how to declare a variable using a class type, and then use the new keyword to create an object. You have also seen how the type-checking rules of C# prevent you from assigning an object of one type to a variable declared with a different type. For example, given the definitions of the Token, IdentifierToken, and KeywordToken classes, the following code is illegal:
class Token { ... }
class IdentifierToken : Token { ... }
class KeywordToken : Token { ... }
...
IdentifierToken it = new IdentifierToken();
KeywordToken kt = it; // error – different types
However, it is possible to refer to an object from a variable of a different type as long as the type used is a class that is higher up the inheritance hierarchy. So the following statements are legal:
IdentifierToken it = new IdentifierToken();
Token t = it; // legal as Token is a base class of IdentifierToken
The inheritance hierarchy means that you can think of an IdentifierToken as a special type of Token (it has everything that a Token has), with a few extra bits (defined by any methods and fields you add to the IdentifierToken class). You can also make a Token variable refer to a KeywordToken object as well. There is one significant limitation, however—when referring to a KeywordToken or IdentifierToken object using a Token variable, you can only access methods and fields that are defined by the Token class. Any additional methods defined by the KeywordToken or IdentifierToken classes are available only when using a KeywordToken or IdentifierToken variable.
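The assignment rules and the access limitation can be seen together in the following sketch. The constructors, the name field, and the IsValid method are hypothetical members invented for this example; they are not part of the chapter's classes.

```csharp
using System;

class Token
{
    private readonly string name;
    public Token(string name) { this.name = name; }
    public string Name() { return name; }
}

class IdentifierToken : Token
{
    public IdentifierToken(string name) : base(name) { }

    // Hypothetical extra method that only IdentifierToken defines.
    public bool IsValid() { return Name().Length > 0; }
}

class KeywordToken : Token
{
    public KeywordToken(string name) : base(name) { }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken("count");
        Token t = it;                    // legal: Token is a base class
        Console.WriteLine(t.Name());     // count (Token members are available)
        Console.WriteLine(it.IsValid()); // True (needs an IdentifierToken variable)

        // t.IsValid();          // would not compile: Token does not define IsValid
        // KeywordToken kt = it; // would not compile: different types

        t = new KeywordToken("while");   // the same variable can refer to a KeywordToken
        Console.WriteLine(t.Name());     // while
    }
}
```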
One of the hardest problems in the realm of computer programming is the task of thinking up unique and meaningful names for identifiers. If you are defining a method for a class, and that class is part of an inheritance hierarchy, sooner or later you are going to try to reuse a name that is already in use by one of the classes higher up the hierarchy. If a base class and a derived class happen to declare two methods that have the same signature (the method signature is the name of the method and the number and types of its parameters), you will receive a warning when you compile the application. The method in the derived class masks (or hides) the method in the base class that has the same signature. For example, if you compile the following code, the compiler will generate a warning message telling you that IdentifierToken.Name hides the inherited method Token.Name:
class Token
{
    ...
    public string Name() { ... }
}

class IdentifierToken : Token
{
    ...
    public string Name() { ... }
}
Although your code will compile and run, you should take this warning seriously. If another class derives from IdentifierToken and calls the Name method, it might be expecting the method implemented in the Token class to be called. However, the Name method in the IdentifierToken class hides the Name method in the Token class, and it will be called instead. Most of the time, such a coincidence is at best a source of confusion, and you should consider renaming methods to avoid clashes. However, if you're sure that you want the two methods to have the same signature, you can silence the warning by using the new keyword as follows:
class Token
{
    ...
    public string Name() { ... }
}

class IdentifierToken : Token
{
    ...
    new public string Name() { ... }
}
Using the new keyword like this does not change the fact that the two methods are completely unrelated and that hiding still occurs. It just turns the warning off. In effect, the new keyword says, “I know what I'm doing, so stop showing me these warnings.”
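One consequence of hiding is that the method that runs depends on the type of the variable, not on the type of the object. The sketch below illustrates this; the return strings are placeholders chosen for the example.

```csharp
using System;

class Token
{
    public string Name() { return "Token.Name"; }
}

class IdentifierToken : Token
{
    // 'new' silences the hiding warning; the two methods remain unrelated.
    new public string Name() { return "IdentifierToken.Name"; }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken();
        Token t = it; // same object, viewed through a Token variable

        Console.WriteLine(it.Name()); // IdentifierToken.Name
        Console.WriteLine(t.Name());  // Token.Name
    }
}
```

Both calls act on the same object, yet they reach different methods, which is why accidental hiding tends to cause confusion.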
Sometimes you do want to hide the way in which a method is implemented in a base class. As an example, consider the ToString method in System.Object. The purpose of ToString is to convert an object to its string representation. Because this method is very useful, it is a member of System.Object, thereby automatically providing all classes with a ToString method. However, how does the version of ToString implemented by System.Object know how to convert a derived class into a string? A derived class might contain any number of fields with interesting values that should be part of the string. The answer is that the implementation of ToString in System.Object is actually a bit simplistic. All it can do is convert an object into a string that represents its type, such as “Token” or “IdentifierToken”. This is not too useful after all. So, why provide a method that is so useless? The answer to this second question requires a bit of detailed thought.
Obviously, ToString is a fine idea in concept (all classes should provide a method that can be used to convert objects into strings). It is only the implementation that is problematic. In fact, you are not expected to call the ToString method defined by System.Object—it is simply a placeholder. Instead, you should provide your own version of the ToString method in each class you define, overriding the default implementation in System.Object. The version in System.Object is only there as a safety net, in case a class does not implement its own ToString method. In this way, you can be confident that you can call ToString on any object, and the method will return its contents as a string, regardless of how it is implemented.
A method that is intended to be overridden is called a virtual method. You should be clear on the difference between overriding a method and hiding a method. Overriding a method is a mechanism for providing different implementations of the same method—the methods are all related because they are intended to perform the same task. Hiding a method is a means of replacing one method with another—the methods are usually unrelated and might perform totally different tasks. Overriding a method is a useful programming concept; hiding a method is probably an error.
You can mark a method as a virtual method by using the virtual keyword. For example, the ToString method in the System.Object class is defined like this:
namespace System
{
    class Object
    {
        public virtual string ToString()
        {
            ...
        }
        ...
    }
    ...
}
If a base class declares that a method is virtual, a derived class can use the override keyword to declare another implementation of that method. For example:
class IdentifierToken : Token
{
    ...
    // valid only if Token declares Name by using the virtual keyword
    public override string Name() { ... }
}
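The same keyword applies to ToString. The following sketch shows one possible override; the name field and the output format are assumptions made for illustration, and PlainToken is a hypothetical class included only to show the inherited default.

```csharp
using System;

class Token
{
    private readonly string name;
    public Token(string name) { this.name = name; }

    // Replaces the placeholder implementation inherited from System.Object.
    public override string ToString() { return "Token: " + name; }
}

// Hypothetical class with no ToString of its own.
class PlainToken { }

class Program
{
    static void Main()
    {
        object o = new Token("while");

        // The override runs even though the variable has type object.
        Console.WriteLine(o.ToString());                // Token: while

        // Without an override, the System.Object version returns the type name.
        Console.WriteLine(new PlainToken().ToString()); // PlainToken
    }
}
```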
There are some important rules you must follow when declaring polymorphic methods by using the virtual and override keywords:
You're not allowed to declare a private method by using the virtual or override keyword. If you try, you'll get a compile-time error. Private really is private.
The virtual method and the override method must match exactly. That is, they must have the same name, the same parameter types, and the same return type.
The two methods must have the same access. For example, if one of the two methods is public, the other must also be public. (C++ programmers should take note. In C++, the methods can have different accessibility.)
You can only override a virtual method. If the base class method is not virtual and you try to override it, you'll get a compile-time error. This is sensible; it should be up to the base class to decide whether its methods can be overridden.
If the derived class does not declare the method by using the override keyword, it does not override the base class method. In other words, it becomes an implementation of a completely different method that happens to have the same name. As before, this will cause a compile-time hiding warning, which you can silence by using the new keyword as previously described.
An override method is implicitly virtual and can itself be overridden in a further derived class. However, you are not allowed to explicitly declare that an override method is virtual by using the virtual keyword.
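The rules above can be seen working together in a short sketch. QuotedIdentifierToken is a hypothetical class added to show that an override method can itself be overridden; the return strings are placeholders.

```csharp
using System;

class Token
{
    // Declared virtual, so derived classes are allowed to override it.
    public virtual string Name() { return "token"; }
}

class IdentifierToken : Token
{
    // Same name, parameters, return type, and access as Token.Name.
    public override string Name() { return "identifier"; }
}

// Hypothetical further derived class.
class QuotedIdentifierToken : IdentifierToken
{
    // Legal because an override method is implicitly virtual.
    public override string Name() { return "quoted identifier"; }
}

class Program
{
    static void Main()
    {
        Token[] tokens =
        {
            new Token(),
            new IdentifierToken(),
            new QuotedIdentifierToken()
        };

        // Each call runs the implementation that belongs to the object's
        // actual type, even though every variable has type Token.
        foreach (Token t in tokens)
        {
            Console.WriteLine(t.Name());
        }
        // Prints "token", "identifier", "quoted identifier".
    }
}
```

Contrast this with hiding, where the type of the variable decides which method is called.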
The public and private access keywords create two extremes of accessibility: public fields and methods of a class are accessible to everyone, whereas private fields and methods of a class are accessible to only the class itself.
These two extremes are sufficient when considering classes in isolation. However, as all experienced object-oriented programmers know, isolated classes cannot solve complex problems. Inheritance is a very powerful way of connecting classes, and there is clearly a very special and close relationship between a derived class and its base class. Frequently it is useful for a base class to allow derived classes to access some of its members, while hiding these same members from classes that are not part of the hierarchy. In this situation, you can use the protected keyword to tag members:
A derived class can access a protected base class member. In other words, inside the derived class, a protected base class member is effectively public.
If the class is not a derived class, it cannot access a protected class member. In other words, inside a class that is not a derived class, a protected class member is effectively private.
C# gives programmers complete freedom to declare both methods and fields as protected. However, most object-oriented guidelines recommend keeping your fields strictly private. Public fields violate encapsulation because all users of the class have direct, unrestricted access to the fields. Protected fields maintain encapsulation for users of a class, for whom the protected fields are inaccessible. However, protected fields still allow encapsulation to be violated by classes that inherit from the class.
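As an illustration of these access rules, the sketch below keeps the field private and exposes it to derived classes through a protected method. The RawText and Describe methods are hypothetical names invented for this example.

```csharp
using System;

class Token
{
    private readonly string name; // the field stays private

    public Token(string name) { this.name = name; }

    // Protected: accessible in Token and in classes derived from Token.
    protected string RawText() { return name; }
}

class IdentifierToken : Token
{
    public IdentifierToken(string name) : base(name) { }

    // A derived class can call the protected base class member.
    public string Describe() { return "identifier " + RawText(); }
}

class Program
{
    static void Main()
    {
        IdentifierToken it = new IdentifierToken("count");
        Console.WriteLine(it.Describe()); // identifier count

        // it.RawText(); // would not compile: Program does not derive from Token
    }
}
```

Because derived classes reach the data only through RawText, the Token class keeps control over how the field is stored.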