7.5 Generating Hash Code
All object instances can provide a signed
32-bit integer hash of their contents via the GetHashCode(
) method
on System.Object. Good
hashes can have a dramatic effect on Hashtable
speed (they are used to determine which bucket to add entries to in
the hashtable), and can also provide a low-fidelity (but possibly
more efficient) equivalence test. Using
GetHashCode in this way is demonstrated in the
following examples:
void Enroll(Student s, CourseList cl) {
hashtable.Add(s, cl); // GHC called on key (s)
}
bool FastCompare(Student s1, Student s2) {
// Use GHC to test for possible equivalence
if (s1.GetHashCode( ) != s2.GetHashCode( )) return false;
// Use Equals to test for definite equivalence
return s1.Equals(s2);
}
The default implementation of GetHashCode( ) on
System.Object returns a semi-unique member #,
while the implementation of GetHashCode( ) on
System.ValueType merely returns the hash of the
first field in the value type. Although these defaults work in a lot
of cases, there are sometimes performance benefits gained from
implementing GetHashCode( ) on your own type.
Additionally, if a type overrides the Equals( )
method, it is required to override the GetHashCode(
) method, which means that many framework types override
GetHashCode( ), as shown here:
void DumpHashes(object o, int i, Version v) {
Console.WriteLine(o.GetHashCode( )); // object index
Console.WriteLine(i.GetHashCode( )); // integer value
Console.WriteLine(v.GetHashCode( )); // hash of fields
}
7.5.1 The Object.GetHashCode Method
The System.GetHashCode method is
declared as follows:
public virtual int GetHashCode( );
It is important to understand the general contract for
GetHashCode( ), which looks like this:
A.GetHashCode( ) even
distribution across all a's a.Equals(b)
a.GetHashCode( ) = = b.GetHashCode( ) A overrides GetHashCode
A overrides
Equals Safe: no exceptions thrown; no circular references
The idea is that good implementations use all 32 bits to provide a
good even distribution and ideally preserve the significance of field
ordering (to ensure that Point(10,20) hashes
differently to Point(20,10)). Preserving field
ordering is traditionally done by multiplying the hash for each field
by some odd prime constant (37 is a popular choice in the Java world,
in which a similar construct exists). Additionally, if you have not
derived directly from System.Object, consider
combining the hash of your base type with the hash of your contained
members. Lastly, remember that contained aggregate data structures
(such as Arrays) may not hash their contents
correctly, and therefore may need to be hashed by hand. The following
is an example implementing some of these rules, in order to provide a
good hash distribution that includes all the type's
data in the hash calculation.
public sealed class Data {
public readonly short x, y;
public readonly Color c;
ArrayList al = new ArrayList( );
public override int GetHashCode( ) {
int hc = 1; // base.GetHashCode if base!=object
hc = 37*hc + x<<16|(ushort)x;
hc = 37*hc + y.GetHashCode( );
hc = 37*hc + (c= =null ? 0 : c.GetHashCode( ));
foreach (object o in al)
hc = 37*hc + o.GetHashCode( );
return hc;
}
}
|