[ Team LiB ] Previous Section Next Section

7.5 Generating Hash Code

All object instances can provide a signed 32-bit integer hash of their contents via the GetHashCode( ) method on System.Object. Good hashes can have a dramatic effect on Hashtable speed (they are used to determine which bucket to add entries to in the hashtable), and can also provide a low-fidelity (but possibly more efficient) equivalence test. Using GetHashCode in this way is demonstrated in the following examples:

void Enroll(Student s, CourseList cl) {
  hashtable.Add(s, cl); // GHC called on key (s)
}
bool FastCompare(Student s1, Student s2) {
  // Use GHC to test for possible equivalence
  if (s1.GetHashCode( ) != s2.GetHashCode( )) return false;
  
  // Use Equals to test for definite equivalence
  return s1.Equals(s2);
}

The default implementation of GetHashCode( ) on System.Object returns a semi-unique member #, while the implementation of GetHashCode( ) on System.ValueType merely returns the hash of the first field in the value type. Although these defaults work in a lot of cases, there are sometimes performance benefits gained from implementing GetHashCode( ) on your own type. Additionally, if a type overrides the Equals( ) method, it is required to override the GetHashCode( ) method, which means that many framework types override GetHashCode( ), as shown here:

void DumpHashes(object o, int i, Version v) {
  Console.WriteLine(o.GetHashCode( )); // object index
  Console.WriteLine(i.GetHashCode( )); // integer value
  Console.WriteLine(v.GetHashCode( )); // hash of fields
}

7.5.1 The Object.GetHashCode Method

The System.GetHashCode method is declared as follows:

public virtual int GetHashCode( );

It is important to understand the general contract for GetHashCode( ), which looks like this:

  1. A.GetHashCode( ) even distribution across all a's

  2. a.Equals(b) a.GetHashCode( ) = = b.GetHashCode( )

  3. A overrides GetHashCode A overrides Equals

  4. Safe: no exceptions thrown; no circular references

The idea is that good implementations use all 32 bits to provide a good even distribution and ideally preserve the significance of field ordering (to ensure that Point(10,20) hashes differently to Point(20,10)). Preserving field ordering is traditionally done by multiplying the hash for each field by some odd prime constant (37 is a popular choice in the Java world, in which a similar construct exists). Additionally, if you have not derived directly from System.Object, consider combining the hash of your base type with the hash of your contained members. Lastly, remember that contained aggregate data structures (such as Arrays) may not hash their contents correctly, and therefore may need to be hashed by hand. The following is an example implementing some of these rules, in order to provide a good hash distribution that includes all the type's data in the hash calculation.

public sealed class Data {
  public readonly short x, y;
  public readonly Color c;
  ArrayList al = new ArrayList( );
  
  public override int GetHashCode( ) {
    int hc = 1; // base.GetHashCode if base!=object
    hc = 37*hc + x<<16|(ushort)x;
    hc = 37*hc + y.GetHashCode( );
    hc = 37*hc + (c=  =null ? 0 : c.GetHashCode( ));
    foreach (object o in al)
      hc = 37*hc + o.GetHashCode( );
    return hc;
  }

}
    [ Team LiB ] Previous Section Next Section