This part looks at what GetHashCode() does. As a quick summary, we have this class:

    public class Employee : IEquatable<Employee>
    {
        private string employeeName;
        private int employeeNumber;

        public Employee(string employeeName, int employeeNumber)
        {
            this.employeeName = employeeName;
            this.employeeNumber = employeeNumber;
        }

        // Equality is based on the employee number only.
        public override bool Equals(object obj)
        {
            if (obj == null) return false;
            Employee other = obj as Employee;
            if (other == null) return false;
            return this.employeeNumber == other.employeeNumber;
        }

        public bool Equals(Employee obj)
        {
            if (obj == null) return false;
            return this.employeeNumber == obj.employeeNumber;
        }
    }

As we have overridden the Equals method (from System.Object), we now get a compiler warning about GetHashCode():

    "'Employee' overrides Object.Equals(object o) but does not override Object.GetHashCode()"

Any time we override Equals we should also override GetHashCode(). But what does GetHashCode() do? Let's create a test. We aren't actually going to put any assertions in it; we will just use it for some "debugging".

    [TestMethod]
    public void MyTestMethod()
    {
        TestContext.WriteLine("HashCode andrew {0}", andrew.GetHashCode());
        TestContext.WriteLine("HashCode rhona {0}", rhona.GetHashCode());
        TestContext.WriteLine("HashCode rhonda {0}", rhonda.GetHashCode());
    }

We are using the TestContext to get some output. Run just this test, then click on the test in the Test Results window. You should see one line of output per employee.
The entries for all three objects have a different "hash number". The hash number is simply an integer. Before looking at the GetHashCode() method, let's create another test using a HashSet. A HashSet is a collection of values that contains no duplicates. First, let's create a HashSet of integers and add the same number twice. Our expectation is that the number will not be added to the HashSet a second time. As a test, we have:

    [TestMethod]
    public void HashSetofIntegerTest()
    {
        HashSet<int> numbers = new HashSet<int>();
        numbers.Add(1);
        numbers.Add(2);
        numbers.Add(3);
        numbers.Add(1);   // duplicate, silently ignored
        Assert.AreEqual(3, numbers.Count);
    }

And when run it passes: despite making four calls to Add, we only have three elements. We don't get an error when we add the duplicate. Instead, the Add method returns a bool, which is true if the element was added and false if it was already in the set. Now let's do this with a HashSet<Employee>:

    [TestMethod]
    public void HashSetOfEmployeeTest()
    {
        HashSet<Employee> workers = new HashSet<Employee>();
        workers.Add(andrew);
        workers.Add(rhonda);
        workers.Add(rhona);
        Assert.AreEqual(2, workers.Count);
    }

Running this test fails: all three employees have been added. However, we want only two people in the set (as rhonda and rhona should be the same person). If you step through the Add calls, the debugger just moves on to the next Add statement. The Add method is not calling our Equals method, so it looks as though the HashSet, which should hold only unique entries, is not checking what it already contains. Or is it? Now add a GetHashCode method (type public override and then select GetHashCode()), but leave the generated default implementation in place:

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }

Now when you step through, the first thing executed in the Employee class is GetHashCode(). In the Call Stack window, right-click and choose "Show External Code". This shows the stack when the Add method runs, including a call to a method named "AddIfNotPresent", which then retrieves the hash code.
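
To make the idea of "hash code first, Equals second" concrete, here is a minimal sketch of a hash-based set. It is not the real HashSet<T> source: the class name TinySet, the fixed bucket count of 16 and the use of a List per bucket are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified set. NOT the actual HashSet<T> implementation.
public class TinySet<T>
{
    private readonly List<T>[] buckets = new List<T>[16];
    private readonly IEqualityComparer<T> comparer = EqualityComparer<T>.Default;

    public int Count { get; private set; }

    public bool Add(T item)
    {
        // Step 1: the hash code alone picks a bucket. Equals is not involved yet.
        int hash = comparer.GetHashCode(item) & 0x7FFFFFFF; // clear the sign bit
        int index = hash % buckets.Length;

        if (buckets[index] == null)
        {
            buckets[index] = new List<T>();
        }

        // Step 2: Equals runs only against items already stored in that bucket.
        foreach (T existing in buckets[index])
        {
            if (comparer.Equals(existing, item))
            {
                return false; // an equal item is already present
            }
        }

        buckets[index].Add(item);
        Count++;
        return true;
    }
}

public static class TinySetDemo
{
    public static void Main()
    {
        var numbers = new TinySet<int>();
        Console.WriteLine(numbers.Add(1)); // True
        Console.WriteLine(numbers.Add(2)); // True
        Console.WriteLine(numbers.Add(3)); // True
        Console.WriteLine(numbers.Add(1)); // False
        Console.WriteLine(numbers.Count);  // 3
    }
}
```

If two items never share a bucket, step 2 has nothing to loop over, which is why a set built this way can add items without ever calling Equals.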
So each time Add is called, GetHashCode is executed, but our Equals methods have not been executed. Because the default hash codes of our three objects differ, the set never finds a candidate that it needs to compare with Equals. A hash code is a numeric value that can be used to determine that two objects are not the same, but it can't be used to tell that two items are the same. Hash functions can be used for indexes. The rules are:
- Two objects that are equal must return the same hash code
- Two objects that return the same hash code are not necessarily the same object
- GetHashCode() should be consistent and return the same hash code for the same data
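
The rules above can be checked with a small example. GridPoint is a hypothetical type that is not part of the Employee code; its deliberately weak XOR hash is chosen only because it makes a collision easy to see.

```csharp
using System;

// Hypothetical type used only to illustrate the hash code rules.
public sealed class GridPoint : IEquatable<GridPoint>
{
    private readonly int x;
    private readonly int y;

    public GridPoint(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public bool Equals(GridPoint other)
    {
        return other != null && x == other.x && y == other.y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as GridPoint);
    }

    // Deliberately weak: XOR is symmetric, so (1,2) and (2,1) collide.
    public override int GetHashCode()
    {
        return x ^ y;
    }
}

public static class HashRulesDemo
{
    public static void Main()
    {
        var a = new GridPoint(1, 2);
        var b = new GridPoint(1, 2);
        var c = new GridPoint(2, 1);

        // Rule 1: equal objects return the same hash code.
        Console.WriteLine(a.Equals(b));                         // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True

        // Rule 2: the same hash code does not imply equality (1 ^ 2 == 2 ^ 1 == 3).
        Console.WriteLine(a.GetHashCode() == c.GetHashCode());  // True
        Console.WriteLine(a.Equals(c));                         // False

        // Rule 3: asking again for the same data gives the same answer.
        Console.WriteLine(a.GetHashCode() == a.GetHashCode());  // True
    }
}
```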
To see the effect of the hash code, change GetHashCode() so that every object returns the same value:

    public override int GetHashCode()
    {
        return 1;
    }

Now run our HashSet test: it passes. There are only two objects in the set. If you step through the code you will see:
- GetHashCode() is called for the first object (andrew). The object is added to the HashSet (the Count property increments to 1).
- GetHashCode() is called for the second object (rhonda). Then the Equals method is called on the andrew object, with the "other" object being rhonda. The two objects don't match, so the object is added to the HashSet (the Count property increments to 2).
- GetHashCode() is called for the third object (rhona), and then Equals is called on the rhonda object, with the "other" object being rhona. The Equals method returns true, so the object isn't added.
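
If you would rather see this sequence without a debugger, the sketch below logs each call. LoggedEmployee is a hypothetical stand-in for Employee, and the employee numbers (1 for andrew, 2 for both rhonda and rhona) are assumptions, since the actual values are not shown here. The order in which a set compares items within a bucket is an implementation detail, so treat the printed order as an observation rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Employee with a constant hash code and logging.
public class LoggedEmployee : IEquatable<LoggedEmployee>
{
    public static int HashCalls;
    public static int EqualsCalls;

    private readonly string name;
    private readonly int number;

    public LoggedEmployee(string name, int number)
    {
        this.name = name;
        this.number = number;
    }

    public bool Equals(LoggedEmployee other)
    {
        EqualsCalls++;
        Console.WriteLine("  Equals: {0} vs {1}", name, other == null ? "null" : other.name);
        return other != null && number == other.number;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as LoggedEmployee);
    }

    public override int GetHashCode()
    {
        HashCalls++;
        Console.WriteLine("  GetHashCode: {0}", name);
        return 1; // every instance gets the same hash code
    }
}

public static class CallOrderDemo
{
    // Adds three employees and returns how many the set kept.
    public static int Run()
    {
        var workers = new HashSet<LoggedEmployee>();
        var people = new[]
        {
            new LoggedEmployee("andrew", 1), // assumed employee numbers
            new LoggedEmployee("rhonda", 2),
            new LoggedEmployee("rhona", 2)
        };

        foreach (LoggedEmployee person in people)
        {
            bool added = workers.Add(person);
            Console.WriteLine("Add returned {0}", added);
        }

        return workers.Count;
    }

    public static void Main()
    {
        Console.WriteLine("Count: {0}", Run());
        Console.WriteLine("GetHashCode calls: {0}, Equals calls: {1}",
            LoggedEmployee.HashCalls, LoggedEmployee.EqualsCalls);
    }
}
```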
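Returning a constant satisfies the rules, but it gives every object the same hash code, so every Add has to fall back on Equals. A better implementation derives the hash code from the same field that Equals compares:

    public override int GetHashCode()
    {
        return this.employeeNumber.GetHashCode();
    }

Check that the test still passes. And if we go back to our code that outputs the hash codes, we see that rhona and rhonda both return the same value.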
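
Putting the pieces together, a complete version of the class might look like the sketch below. The names and employee numbers given to andrew, rhonda and rhona are assumptions for illustration; the only requirement taken from the tests is that rhonda and rhona share an employee number.

```csharp
using System;
using System.Collections.Generic;

public class Employee : IEquatable<Employee>
{
    private readonly string employeeName;
    private readonly int employeeNumber;

    public Employee(string employeeName, int employeeNumber)
    {
        this.employeeName = employeeName;
        this.employeeNumber = employeeNumber;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Employee);
    }

    public bool Equals(Employee other)
    {
        return other != null && employeeNumber == other.employeeNumber;
    }

    // Uses the same field as Equals, so equal employees share a hash code.
    public override int GetHashCode()
    {
        return employeeNumber.GetHashCode();
    }
}

public static class FinalEmployeeDemo
{
    // Returns how many distinct employees the set keeps.
    public static int CountDistinct()
    {
        var andrew = new Employee("Andrew", 1); // assumed values
        var rhonda = new Employee("Rhonda", 2);
        var rhona = new Employee("Rhona", 2);   // same number as rhonda

        var workers = new HashSet<Employee> { andrew, rhonda, rhona };
        return workers.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // 2
    }
}
```

Keeping Equals and GetHashCode tied to the same field means a later change to one is a reminder to change the other.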