2.4 Value Types and Reference Types

All C# types fall into the following categories:

Value types (struct, enum)
Reference types (class, array, delegate, interface)
Pointer types

The fundamental difference between the three main categories (value types, reference types, and pointer types) is how they are handled in memory. The following sections explain the essential differences between value types and reference types. Pointer types fall outside mainstream C# usage, and are covered later in Chapter 4.

2.4.1 Value Types

Value types are the easiest types to understand. They directly contain data, such as the int type (holds an integer), or the bool type (holds a true or false value). The key characteristic of a value type is that when you assign one value to another, you make a copy of that value. For example:

using System;
class Test {
  static void Main ( ) {
    int x = 3;
    int y = x; // assign x to y, y is now a copy of x
    x++; // increment x to 4
    Console.WriteLine (y); // prints 3
  }
}

2.4.2 Reference Types

Reference types are a little more complex. A reference type really defines two separate entities: an object, and a reference to that object. This example follows exactly the same pattern as our previous example, but notice that the change made through x is visible through y, whereas in our previous example, y remained unchanged:

using System;
using System.Text;
class Test {
  static void Main ( ) {
    StringBuilder x = new StringBuilder ("hello");
    StringBuilder y = x;
    x.Append (" there");
    Console.WriteLine (y); // prints "hello there"
  }
}

This is because the StringBuilder type is a reference type, while the int type is a value type. When we declared the StringBuilder variable, we were actually doing two separate things, which can be separated into these two lines:

StringBuilder x;
x = new StringBuilder ("hello");

The first line declares a variable that can hold a reference to a StringBuilder. The second line creates a new StringBuilder object and assigns it to the reference. Let's look at the next line:

StringBuilder y = x;

When we assign x to y, we are saying, "make y point to the same thing that x points to." A reference stores the address of an object (an address is a memory location, stored as a four-byte number on a 32-bit platform, or an eight-byte number on a 64-bit platform). We're actually still making a copy of x, but we're copying this address as opposed to the StringBuilder object itself.

Let's look at this line:

x.Append (" there");

This line actually does two things. It first finds the memory location represented by x, and then it tells the StringBuilder object that lies at that memory location to append " there" to it. We could have achieved exactly the same effect by appending " there" to y, because x and y refer to the same object:

y.Append (" there");
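The following is a minimal sketch of the same idea, not part of the original listing. The class name AliasDemo and its two helper methods are our own, chosen for illustration. It uses object.ReferenceEquals to check that x and y refer to one object, and shows that an append made through y is visible through x:

```csharp
using System;
using System.Text;

class AliasDemo {
  // Appends through y and reads the result back through x.
  public static string AppendViaAlias() {
    StringBuilder x = new StringBuilder("hello");
    StringBuilder y = x;   // copies the reference, not the object
    y.Append(" there");    // modifies the one shared object
    return x.ToString();
  }

  // Reports whether the two variables refer to the same object.
  public static bool SameObject() {
    StringBuilder x = new StringBuilder("hello");
    StringBuilder y = x;
    return object.ReferenceEquals(x, y);
  }

  static void Main() {
    Console.WriteLine(AppendViaAlias()); // prints "hello there"
    Console.WriteLine(SameObject());     // prints "True"
  }
}
```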

A reference may point at no object, which is done by assigning null to the reference. In this code sample, we assign null to x, but can still access the same StringBuilder object we created via y:

using System;
using System.Text;
class Test {
  static void Main ( ) {
    StringBuilder x;
    x = new StringBuilder ("hello");
    StringBuilder y = x;
    x = null;
    y.Append (" there");
    Console.WriteLine (y); // prints "hello there"
  }
}
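As a complement, here is a hedged sketch of what happens if we try to use x itself after it has been set to null. The class name NullDemo and the helper method are hypothetical. Because there is no object behind x, the runtime throws a NullReferenceException, while the object remains reachable through y:

```csharp
using System;
using System.Text;

class NullDemo {
  // Returns true if calling a method through a null reference
  // throws, and the object is still intact when read through y.
  public static bool NullAccessThrows() {
    StringBuilder x = new StringBuilder("hello");
    StringBuilder y = x;
    x = null;               // x no longer refers to any object
    try {
      x.Append(" there");   // no object behind x, so this throws
      return false;
    }
    catch (NullReferenceException) {
      return y.ToString() == "hello"; // object unchanged, seen via y
    }
  }

  static void Main() {
    Console.WriteLine(NullAccessThrows()); // prints "True"
  }
}
```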

2.4.3 The Heap and the Stack

The stack is a block of memory that grows each time a function is entered (basically to store local variables) and shrinks each time a function exits (because the local variables are no longer needed). In our previous example, when the Main method finishes executing, the references x and y go out of scope, as do any value types declared in the method. This is because these values are stored on the stack.

The heap is a block of memory in which reference-type objects are stored. Whenever a new object is created, it is allocated on the heap, and a reference to that object is returned. During a program's execution, the heap starts filling up as new objects are created. The runtime has a garbage collector that deallocates objects from the heap so your computer does not run out of memory. An object becomes eligible for deallocation when the garbage collector determines that it can no longer be reached through any reference.

You can't explicitly delete objects in C#. An object is either automatically popped off the stack or automatically collected by the garbage collector.

2.4.3.1 Value types and reference types side-by-side

A good way to understand the difference between value types and reference types is to see them side-by-side. In C#, you can define your own reference types or your own value types. If you want to define a simple type such as a number, for which efficiency and copy-by-value semantics are desirable, it makes sense to define a value type. Otherwise you should define a reference type. You can define a new value type by declaring a struct, and define a new reference type by defining a class.

To create a value-type or reference-type instance, the constructor for the type may be called, with the new keyword. A value-type constructor simply initializes an object. A reference-type constructor creates a new object on the heap, and then initializes the object:

// Reference-type declaration
class PointR {
  public int x, y;
}
// Value-type declaration
struct PointV {
  public int x, y;
}
class Test {
  static void Main( ) {
    PointR a; // Local reference-type variable, uses 4 bytes of
              // memory on the stack to hold address (assuming
              // a 32-bit platform)
    PointV b; // Local value-type variable, uses 8 bytes of
              // memory on the stack for x and y
    a = new PointR( ); // Assigns the reference to address of new
                      // instance of PointR allocated on the
                      // heap. The object on the heap uses 8
                      // bytes of memory for x and y, and an
                      // additional 8 bytes for core object
                      // requirements, such as storing the
                      // object's type and synchronization state
    b = new PointV( ); // Calls the value-type's default
                      // constructor.  The default constructor 
                      // for both PointR and PointV will set 
                      // each field to its default value, which 
                      // will be 0 for both x and y.
    a.x = 7;
    b.x = 7;
  }
}
// At the end of the method the local variables a and b go out of
// scope, but the new instance of a PointR remains in memory until
// the garbage collector determines it is no longer referenced

Assignment to a reference type copies an object reference, while assignment to a value type copies an object value:

    ...
    PointR c = a;
    PointV d = b;
    c.x = 9;
    d.x = 9;
    Console.WriteLine(a.x); // Prints 9
    Console.WriteLine(b.x); // Prints 7
  }
}

As this example shows, an object on the heap can be pointed at by multiple variables, whereas an object on the stack or inline can only be accessed via the variable it was declared with. "Inline" means that the variable is part of a larger object; i.e., it exists as a data member or an array member.
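To make the "inline" case concrete, here is a small sketch under our own assumptions: the Holder class and the InlineDemo helpers are hypothetical names, and PointV is redeclared so the listing stands alone. The struct field p lives inside the Holder object. Reading it into another variable copies the value, whereas writing through h.p changes the inline value itself:

```csharp
using System;

struct PointV {
  public int x, y;
}

class Holder {
  public PointV p;  // stored inline, inside the Holder object on the heap
}

class InlineDemo {
  // Modifies a copy of the inline value; the original is untouched.
  public static int CopyFromField() {
    Holder h = new Holder();
    PointV copy = h.p;  // copies the value out of the object
    copy.x = 9;         // changes only the copy
    return h.p.x;
  }

  // Modifies the inline value directly through its declaring variable.
  public static int WriteThroughField() {
    Holder h = new Holder();
    h.p.x = 9;
    return h.p.x;
  }

  static void Main() {
    Console.WriteLine(CopyFromField());     // prints 0
    Console.WriteLine(WriteThroughField()); // prints 9
  }
}
```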

2.4.4 Type System Unification

C# provides a unified type system, whereby the object class is the ultimate base type for both reference types and value types. This means all types, apart from the occasionally used pointer types, share the same basic set of characteristics.

2.4.4.1 Simple types are value types

Simple types are so called because most have a direct representation in machine code. For example, the floating-point numbers in C# are matched by the floating-point numbers in most processors, such as Pentium processors. For this reason, most languages treat them specially, but in doing so create two separate sets of rules for simple types and user-defined types. In C#, all types follow the same set of rules, resulting in greater programming simplicity.

To do this, the simple types in C# alias structs found in the System namespace. For instance, the int type aliases the System.Int32 struct, the long type aliases the System.Int64 struct, etc. Simple types therefore have the same features one would expect any user-defined type to have. For instance, the int type has function members:

int i = 3;
string s = i.ToString( );

This is equivalent to:

// This is an explanatory version of System.Int32
namespace System {
  struct Int32 {
    ...
    public string ToString( ) {
      return ...;
    }
  }
}
// This is valid code, but we recommend you use the int alias
System.Int32 i = 5;
string s = i.ToString( );
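One way to check the alias relationship for yourself is sketched below. The class name AliasTypeDemo and its helpers are our own. It compares the two type objects and asks an int for the full name of its runtime type:

```csharp
using System;

class AliasTypeDemo {
  // int and System.Int32 name one and the same type.
  public static bool SameType() {
    return typeof(int) == typeof(System.Int32);
  }

  // Asks an int value for the full name of its type.
  public static string TypeName() {
    int i = 3;
    return i.GetType().FullName;
  }

  static void Main() {
    Console.WriteLine(SameType());   // prints "True"
    Console.WriteLine(TypeName());   // prints "System.Int32"
    Console.WriteLine(int.MaxValue); // prints 2147483647
  }
}
```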

2.4.4.2 Value types expand the set of simple types

Simple types have two useful features: they are efficient, and their copy-by-value semantics are intuitive. Consider again how natural it is to assign one number to another and get a copy of the value of that number, as opposed to getting a copy of a reference to that number. In C#, value types are defined to expand the set of simple types. In this example, we revisit our PointV and PointR example, but this time look at efficiency.

Creating an array of 1,000 ints is very efficient. This allocates 1,000 ints in one contiguous block of memory:

int[ ] a = new int[1000];

Similarly, creating an array of a value type PointV is very efficient too:

struct PointV {
  public int x, y;
}
PointV[ ] a = new PointV[1000];

If we used a reference type PointR, we would need to instantiate 1,000 individual points after instantiating the array:

class PointR {
   public int x, y;
}
PointR[ ] a = new PointR[1000]; // creates an array of 1000 null references
for (int i=0; i<a.Length; i++)
   a[i] = new PointR( );
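The difference can be observed directly. In the sketch below (ArrayDemo and its helpers are hypothetical names, and the two point types are redeclared so the listing is self-contained), the elements of the struct array are usable as soon as the array is created, while the elements of the class array start out as null references:

```csharp
using System;

struct PointV { public int x, y; }
class PointR { public int x, y; }

class ArrayDemo {
  // Elements of a value-type array already exist, zero-initialized.
  public static int FirstValueX() {
    PointV[] v = new PointV[1000];
    return v[0].x;
  }

  // Elements of a reference-type array are null until assigned.
  public static bool FirstReferenceIsNull() {
    PointR[] r = new PointR[1000];
    return r[0] == null;
  }

  static void Main() {
    Console.WriteLine(FirstValueX());          // prints 0
    Console.WriteLine(FirstReferenceIsNull()); // prints "True"
  }
}
```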

In Java, only the simple types (int, float, etc.) can be treated with this efficiency, while in C# one can expand the set of simple types by declaring a struct.

Furthermore, C#'s operators may be overloaded, so that operations that are typically applicable only to simple types are applicable to any class or struct, such as +, -, etc. (see Section 4.5, in Chapter 4).
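As a brief illustration ahead of the full treatment in Chapter 4, the sketch below adds a + operator to a PointV-style struct. The constructor, the choice of component-wise addition, and the OperatorDemo class are our own assumptions, not part of the earlier listings:

```csharp
using System;

struct PointV {
  public int x, y;

  public PointV(int x, int y) {
    this.x = x;
    this.y = y;
  }

  // Adds two points component by component.
  public static PointV operator +(PointV a, PointV b) {
    return new PointV(a.x + b.x, a.y + b.y);
  }
}

class OperatorDemo {
  public static PointV Sum() {
    return new PointV(1, 2) + new PointV(3, 4);
  }

  static void Main() {
    PointV p = Sum();
    Console.WriteLine(p.x); // prints 4
    Console.WriteLine(p.y); // prints 6
  }
}
```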

2.4.4.3 Boxing and unboxing value types

So that common operations can be performed on both reference types and value types, each value type has a corresponding hidden reference type. An instance of this hidden type is created when the value type is cast to a reference type. This process is called boxing. A value type may be cast to the "object" class, which is the ultimate base class for all value types and reference types, or to an interface it implements.

In this example, we box and unbox an int to and from its corresponding reference type:

class Test {
  static void Main ( ) {
    int x = 9;
    object o = x; // box the int
    int y = (int)o; // unbox the int 
  }
}

When a value type is boxed, a new reference-type object is created to hold a copy of the value. Unboxing copies the value from the reference-type object back into a value type. Unboxing requires an explicit cast, and a check is made to ensure the value type to convert to matches the type contained in the object. An InvalidCastException is thrown if the check fails. You never need to worry about what happens to boxed objects once you've finished with them; the garbage collector takes care of them for you.
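These rules can be sketched in a few lines. The class BoxingDemo and its helper methods are hypothetical names chosen for this illustration. The first helper shows that the box holds a copy, the second that unboxing to a different value type fails the check, and the third that casting to an implemented interface (here IComparable, which int implements) also boxes the value:

```csharp
using System;

class BoxingDemo {
  // The box holds a copy, so later changes to x do not affect it.
  public static int BoxedValueAfterChange() {
    int x = 9;
    object o = x;  // box: copies 9 into a new object on the heap
    x = 10;        // changes only the original variable
    return (int)o; // unbox: the box still holds 9
  }

  // Unboxing to a type other than the boxed one fails the check.
  public static bool WrongUnboxThrows() {
    object o = 9;  // boxed int
    try {
      long n = (long)o; // the box holds an int, not a long
      return false;
    }
    catch (InvalidCastException) {
      return true;
    }
  }

  // Casting to an interface the value type implements also boxes it.
  public static bool BoxViaInterface() {
    IComparable c = 9;
    return c.CompareTo(9) == 0;
  }

  static void Main() {
    Console.WriteLine(BoxedValueAfterChange()); // prints 9
    Console.WriteLine(WrongUnboxThrows());      // prints "True"
    Console.WriteLine(BoxViaInterface());       // prints "True"
  }
}
```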

Using collection classes is a good example of boxing and unboxing. In this example, we use the Queue class with value types:

using System;
using System.Collections;
class Test {
  static void Main ( ) {
    Queue q = new Queue ( );
    q.Enqueue (1); // box an int
    q.Enqueue (2); // box an int
    Console.WriteLine ((int)q.Dequeue( )); // unbox an int
    Console.WriteLine ((int)q.Dequeue( )); // unbox an int
  }
}
