November 21, 2008, Friday, 325

Reductio EqualsHashCode

From Workingmouse Wiki

Jump to: navigation, search

Contents

So you wanna know how to test Java's equals/hashCode huh?

Here is a true statement that should hold for any equals and hashCode implementation:

forall x. forall y. x.equals(y) => x.hashCode() == y.hashCode()

You can read => as implies. So if x equals y, then this implies that the hash code of x is equal to the hash code of y. This is called logical implication and is often written in English as "if p then q", for some p and q, just like I did in the previous statement. Easy peasey eh? Here is the truth table for logical implication:

p    q    p => q

0    0    1
0    1    1
1    0    0
1    1    1

This looks a bit counter-intuitive at first, especially the second entry, but with a bit more thought, you might be able to see why it is so. If p is false and q is true, then p implies q is also true. If we altered the truth table such that 'false implies true' were false, then the table would be for logical equivalence (also known as the biconditional). Confusing implication and equivalence is one of the most common mistakes that I see in general discussion, so it really helps to understand the distinction. If the hash code of x is equal to the hash code of y, then does not imply that x equals y. In other words, the converse of the original statement is not true. To suggest it is so is to make this mistake. Anyway, on with the story on testing this property...

I will use Java 7 BGGA syntax here, however, all of these examples have an equivalent using Java 1.5 syntax, which is a little more verbose, so I will leave it as an exercise for the reader ;) (really though, it's very straight-forward; replace a closure with an instance of the fj.F interface, which contains one method).

I will start with a typical example of a user-defined class called MyClass that has an equals and hashCode implementation. It's very straight-forward:

public final class MyClass {
  private final byte b;
  private final String s;

  MyClass(final byte b, final String s) {
    this.b = b;
    this.s = s;
  }

  public byte b() { return b; }
  public String s() { return s; }

  public boolean equals(final Object o) {
    return o != null &&
        o.getClass() == MyClass.class &&
        b == ((MyClass)o).b &&
        s.equals(((MyClass)o).s);
  }

  public int hashCode() {
    final int p = 419;
    int result = 239;
    result = p * result + b;
    result = p * result + s.hashCode();
    return result;
  }
}

How might we test the equals and hashCode implementation of MyClass? If we were using JUnit (or some other non-automated test tool):

assertEqual(new MyClass(7, "abc").hashCode(), new MyClass(7, "abc").hashCode());
assertEqual(new MyClass(42, "CBA").hashCode(), new MyClass(42, "CBA").hashCode());
... etc. etc.

What we would really like to say is something like: "I assert that for any two equal instances of MyClass (call them x and y), then the hashCode of x is equal to the hashCode of y. Dear test tool (Reductio!), please try your best to show me why this is false if you find it to be so". Well, that's just an English way of saying the logical implication statement above, right? We could write this statement in Java (boiler-plate omitted):

Property p = property(arbMyClass, arbMyClass, { MyClass m1, MyClass m2 =>
      bool(m1.equals(m2)).implies(m1.hashCode() == m2.hashCode()) })

I'll give you the same code in Java 1.5, but just this once :)

Property p = property(arbMyClass, arbMyClass, new F2<MyClass, MyClass, Property>() {
    public Property f(MyClass m1, MyClass m2) {
      return bool(m1.equals(m2)).implies(m1.hashCode() == m2.hashCode());
    }
  });

A naive instance for arbMyClass would typically make use of the fact that Arbitrary is a monad (if you don't know what that means, then all you need to know is that we are using a method called bind and it really helps us ;)):

Arbitrary<MyClass> arbMyClass = arbitrary(arbByte.gen.bind(arbString.gen,
      { byte b => { String s => new MyClass(b, s) }} ));

All done and dusted; just run it, observe a few hundred or so passing unit tests and then go home, right? Nup, sorry. If you try running this (and I tried myself), you'll be waiting quite some time before the tests complete execution. This is because you have to wait for the generator to produce two equal instances of MyClass before just one test executes! The instances of MyClass are as arbitrary as the byte and String values that are generated and that is quite arbitrary.

You could cheat. You could adjust the test execution parameters to run say, only 5 unit tests and you might see your test finish execution before lunch time. But that's very diluted testing and you may as well be using manual testing.

Thankfully, Reductio makes it easy to adjust our level of arbitrariness. All we need to do is write an Arbitrary for MyClass that is more likely to produce equal instances. Sound hard? No, it's easy. First, we write an Arbitrary for byte that generates a subset of possible values. In this case, 1 of 3 values is generated:

Arbitrary<Byte> arbByteR = arbitrary(arbByte.gen.map({ byte b => (byte)(b % 3) }));

Next we do the same for String. We will select 1 from 12 (2*3*2) possible String values (we add 'a' to the result so that it generates nice, printable values, just because we can!):

Arbitrary<String> arbStringR = arbitrary(arbCharacter.gen.bind(arbCharacter.gen, arbCharacter.gen,
    { char c1 => { char c2 => { char c3 =>
          new String(new char[]{(char)(c1 % 2 + 'a'), (char)(c2 % 3 + 'a'), (char)(c3 % 2 + 'a')}) }}}));

Now, we can write the Arbitrary for MyClass using these two more restrictive implementations. We use the same code as earlier, but notice that the Arbitrary instances for byte and String have slightly different names that the standard ones provided by Reductio (suffixed with 'R'):

Arbitrary<MyClass> arbMyClass = arbitrary(arbByteR.gen.bind(arbStringR.gen,
      { byte b => { String s => new MyClass(b, s) }} ));

We will have to adjust the test parameters so that the number of minimum successful tests has a reasonable proportion to the number of maximum discarded tests (those that do not satisfy our premise that the two instances are equal). We can determine that there is a 1 in 36 chance that two generated instances of MyClass are equal, so if we are executing 100 minimum successful tests, then 10,000 maximum discarded tests should be sufficient to prevent an exhaustion when we execute these tests. We use the earlier declared Property p, which if you remember, accepted two arguments as arbMyClass. Let's print the result!

summary.println(p.check(100, 10000, 0, 100));

If you hold your left index finger in your right ear, you might see a result like this:

OK, passed 100 tests (4776 discarded).

Yippee! If you hold your right finger in your left ear, you might see this:

OK, passed 100 tests (4892 discarded).

The important point (regardless of which finger and which ear) is that we aren't approaching our maximum discarded 10,000 unit tests. Doing so would result in a message like this:

Arguments exhausted after <n> tests.

...where n is a value less than 100.

So there you have it, automated unit testing Java's equals and hashCode methods using Reductio ;) OK, so are we finished showing off yet? Sorry, but no :)

Here is another interesting observation: we have documented the specification for the relationship between equals and hashCode. We can read in the tests that: x.equals(y).implies(x.hashCode() == y.hashCode()). Can't ask for better documentation than that can you? OK, I think that's enough showing off (for now anyway).

Got any questions? Please feel free to join the Reductio mailing list and ask. Our list members are always welcoming to beginners and we encourage questions at any level of understanding, so please, please don't be afraid to ask!

You can find a complete compilable and runnable example using Java 7 BGGA syntax below and the same example that uses Java 1.5 below that.


Links


Example (Java 7 BGGA)

import reductio.Arbitrary;
import static reductio.Arbitrary.arbByte;
import static reductio.Arbitrary.arbCharacter;
import static reductio.Arbitrary.arbitrary;
import static reductio.Bool.bool;
import static reductio.CheckResult.summary;
import reductio.Property;
import static reductio.Property.property;

public final class EqualsHashCode {
  public static final class MyClass {
    private final byte b;
    private final String s;

    MyClass(final byte b, final String s) {
      this.b = b;
      this.s = s;
    }

    public byte b() { return b; }
    public String s() { return s; }

    public boolean equals(final Object o) {
      return o != null &&
          o.getClass() == MyClass.class &&
          b == ((MyClass)o).b &&
          s.equals(((MyClass)o).s);
    }

    public int hashCode() {
      final int p = 419;
      int result = 239;
      result = p * result + b;
      result = p * result + s.hashCode();
      return result;
    }
  }

  public static void main(final String[] args) {
    final Arbitrary<Byte> arbByteR = arbitrary(arbByte.gen.map({ byte b => (byte)(b % 3) }));

    final Arbitrary<String> arbStringR = arbitrary(arbCharacter.gen.bind(arbCharacter.gen, arbCharacter.gen,
      { char c1 => { char c2 => { char c3 =>
          new String(new char[]{(char)(c1 % 2 + 'a'), (char)(c2 % 3 + 'a'), (char)(c3 % 2 + 'a')}) }}}));

    final Arbitrary<MyClass> arbMyClass = arbitrary(arbByteR.gen.bind(arbStringR.gen,
      { byte b => { String s => new MyClass(b, s) }} ));

    final Property p = property(arbMyClass, arbMyClass, { MyClass m1, MyClass m2 =>
      bool(m1.equals(m2)).implies(m1.hashCode() == m2.hashCode()) });
    summary.println(p.check(100, 10000, 0, 100)); // OK, passed 100 tests (4776 discarded).
  }
}

Example (Java 1.5)

import fj.F;
import fj.F2;
import fj.F3;
import static fj.Function.curry;
import reductio.Arbitrary;
import static reductio.Arbitrary.arbByte;
import static reductio.Arbitrary.arbCharacter;
import static reductio.Arbitrary.arbitrary;
import static reductio.Bool.bool;
import static reductio.CheckResult.summary;
import reductio.Property;
import static reductio.Property.property;

public final class EqualsHashCode {
  public static final class MyClass {
    private final byte b;
    private final String s;

    MyClass(final byte b, final String s) {
      this.b = b;
      this.s = s;
    }

    public byte b() { return b; }
    public String s() { return s; }

    public boolean equals(final Object o) {
      return o != null &&
          o.getClass() == MyClass.class &&
          b == ((MyClass)o).b &&
          s.equals(((MyClass)o).s);
    }

    public int hashCode() {
      final int p = 419;
      int result = 239;
      result = p * result + b;
      result = p * result + s.hashCode();
      return result;
    }
  }

  public static void main(final String[] args) {
    final Arbitrary<Byte> arbByteR = arbitrary(arbByte.gen.map(new F<Byte, Byte>() {
      public Byte f(final Byte b) {
        return (byte)(b % 3);
      }
    }));

    final Arbitrary<String> arbStringR = arbitrary(arbCharacter.gen.bind(arbCharacter.gen, arbCharacter.gen, curry(new F3<Character, Character, Character, String>() {
      public String f(final Character c1, final Character c2, final Character c3) {
        return new String(new char[]{(char)(c1 % 2 + 'a'), (char)(c2 % 3 + 'a'), (char)(c3 % 2 + 'a')});
      }
    })));

    final Arbitrary<MyClass> arbMyClass = arbitrary(arbByteR.gen.bind(arbStringR.gen, curry(new F2<Byte, String, MyClass>() {
      public MyClass f(final Byte b, final String s) {
        return new MyClass(b, s);
      }
    })));

    final Property p = property(arbMyClass, arbMyClass, new F2<MyClass, MyClass, Property>() {
      public Property f(final MyClass m1, final MyClass m2) {
        return bool(m1.equals(m2)).implies(m1.hashCode() == m2.hashCode());
      }
    });
    summary.println(p.check(100, 10000, 0, 100)); // OK, passed 100 tests (4776 discarded).
  }
}