tr ouwens

by the way: things I want to say

On transitivity

Last time, I discussed a transitivity issue for equals methods that also affects EqualsVerifier. To recap, consider this class:

class School {
    private final String name;
    private final String nickname;

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof School)) {
            return false;
        }
        School other = (School)obj;
        return name.equals(other.name) || nickname.equals(other.nickname);
    }
}

This class violates the transitivity requirement: if x equals y, and if y equals z, then x must equal z. It’s easy to see why:

  • new School("A". "1") is equal to new School("B", "1")
  • new School("B", "1") is equal to new School("B", "2")
  • new School("A", "1") is NOT equal to new School("B", "2")

It turned out that EqualsVerifier was unable to detect this case. At the end of my post, I resolved to fix this issue quickly. Since I raised the issue on this blog, I felt I should publish the solution here as well. It took me a little while, but here’s the solution! I won’t bore you with the various detours I took while finding it and get straight to the point.

Approach

EqualsVerifier works by creating intances of classes the same way mocking frameworks like Mockito and EasyMock do: using bytecode trickery library Objenesis. Using reflection, it then ‘fills’ these instances with certain values. How this works exactly is a topic for another day, but it is important to know that for most types, EqualsVerifier either has 2 pre-fabricated values it can use for this, or it can construct these two values. In some cases, though, EqualsVerifier can’t construct any values, and the user has to supply them using the withPrefabValues method.

EqualsVerifier uses carefully chosen combinations of values to create instances, and calls equals and hashCode on these instances to find out whether certain properties of the contract are met. The idea, therefore, is to construct instances of a class in such a way that they expose our transitivity issue.

A partial solution

Let’s try to solve the case for 2 fields first. This relatively easy. Above, we created three instances of the School class that I called A1, B1 and B2, partly because I was too lazy to come up with actual names of schools, but also partly because this nicely shows how the issue can be approached for any class with two fields: we could have chosen any two values valid for the types at hand, as long as they are distributed according to the A1, B1, B2 pattern.

As mentioned above, transitivity requires that if x.equals(y) && y.equals(z), then also x.equals(z). The key insight here is that the order doesn’t matter: if y.equals(z) && z.equals(x) then y.equals(x) and if z.equals(x) && x.equals(y) then z.equals(y), etc. In other words, it doesn’t matter in which order we perform the equals checks.

So there are three checks we need to perform: x.equals(y), y.equals(z), and x.equals(z). (We’re going to assume that symmetry holds; if not, there’s another EqualsVerifier test that will catch that.) So, if two of these checks are true, the third one has to be true as well, or it’s non-transitive. In other words: if exactly two out of these three checks are true, equals is non-transitive. If all three are true, it’s ok; if only one or zero are true, it’s also ok. You can verify this easily in the following table:

A1==B1 && B1==B2 && A1==B2
A1==B1 && B1==B2 && A1!=B2  ← not ok
A1==B1 && B1!=B2 && A1==B2  ← not ok
A1==B1 && B1!=B2 && A1!=B2
A1!=B1 && B1==B2 && A1==B2  ← not ok
A1!=B1 && B1==B2 && A1!=B2
A1!=B1 && B1!=B2 && A1==B2
A1!=B1 && B1!=B2 && A1!=B2

The full solution

Now that we have the two-field solution, a more general solution is within reach: we simply have to pretend that the class only has two fields. How do we do that? Simple: we apply the two-field solution separately for each field in the class. We can iterate over the fields, where the current field is the ‘first’ value, and all other fields together represent the ‘second’ value. So if the second value changes, all fields (except the current one) change.

Let’s say we have a class with four fields, f, g, h and i. We’ll represent them with A/B, 1/2, α (alpha)/β (beta) and ا (alif) /ب (ba), respectively. Then we have to solve the following cases:

       fghi     fghi     fghi
f:  A1=A1αا  B1=B1αا  B2=B2βب
g:  A1=A1αا  B1=A2αا  B2=B2βب
h:  A1=A1αا  B1=A1βا  B2=B2βب
i:  A1=A1αا  B1=A1αب  B2=B2βب

Note how A1 and B2 are the same in all cases, and how the “B” ripples through the B1 values in the middle column.

If we perform the two-field solution for each of these four cases, we’re done. All of these cases must be transitive, as per the partial solution. If one of them isn’t, the equals method is bad and EqualsVerifier will say so.

Conclusion

I had no idea that Cedric Beust’s coding challenge would be such a challenge. But I’m glad that it was; EqualsVerifier benefitted from it, not just because a subtle bug was finally fixed, but also because I discovered several other areas for improvement along the way, which I will address in future releases.

In the mean time, I would like to urge you to update your pom files to version 1.2, and see if your equals methods are truly transitive! :)

Reaction to Cedric Beust's equals challenge

This week, I got nerd sniped by Cedric Beust’s latest coding challenge:

A School has either a name (“Stanford”) or a nickname (“UCSD”) or both.

Your task is to write equals() and hashCode() methods for the School class that satisfy the following requirement: two Schools are identical if they either have the same name or the same nickname.

I wrote EqualsVerifier; I should be able to do this! I came up with the following, which I thought was pretty ironclad:

public final class School {
    private final String name;
    private final String nickname;

    public School(String name, String nickname) {
        this.name = name;
        this.nickname = nickname;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof School)) {
            return false;
        }
        School other = (School)obj;
        return nullSafeEqual(name, other.name) ||
                nullSafeEqual(nickname, other.nickname);
    }

    private boolean nullSafeEqual(String a, String b) {
        return a == null ? b == null : a.equals(b);
    }
    
    @Override
    public int hashCode() {
        return 42;
    }

    @Override
    public String toString() {
        return "School: name=" + name + ", nickname=" + nickname;
    }
}

I even added a toString()! Obviously, I wasn’t really happy about the hashCode() method, but I wasn’t really sure how to write it. Essentially, this implementation will turn your efficient hash collection with O(1) lookup into a list with O(n) lookup. So yeah, that’s pretty bad. But at least it meets the contract, so I thought I’d fix that later; first I wanted to see if my equals() was correct. So I defined some tests with EqualsVerifier:

@Test
public void testEquals() {
    School one = new School("A", "1");
    School two = new School("A", "2");
    School three = new School("B", "2");

    School x = new School("X", "0");

    EqualsVerifier.forRelaxedEqualExamples(one, two)
            .andUnequalExample(x)
            .verify();

    EqualsVerifier.forRelaxedEqualExamples(two, three)
            .andUnequalExample(x)
            .verify();
}

So far so good! I had to use the slightly verbose forRelaxedEqualExamples() mode of EqualsVerifier here, because the regular case is meant for classes that are equal only when all their fields are equal, which is obviously not the case here: the equality relation defined by Beust is more relaxed — hence the name.

Then I decided to try and combine the two calls to EqualsVerifier:

EqualsVerifier.forRelaxedEqualExamples(one, two, three)
        .andUnequalExample(x)
        .verify();

And that’s where my test failed: Precondition: not all objects are equal. And that’s true: one and three aren’t equal to each other. But they should be, because the contract clearly states:

It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.

So there you have it: this equality relation isn’t transitive. Therefore, it’s not possible to write an equals() method that follows Beust’s requirement, and still meet the contract. Was the challenge a trick question?

This also explains why I had trouble with my hashCode() method: equal objects must always return the same hashCode. So if one and three are somehow equal, without having anything in common except the link with two, then the only possible hashCode is a constant.

Now you might ask: so what if my equals() method isn’t transitive? If this is the kind of equality semantics that I want, then why can’t I have it? It’s a free country!

And that’s true: you’re free to break the equals contract. But, as with any contract, if you break it, there will be consequences. In this case, the consequences will be subtle, hard-to-track bugs. One of the commenters had a good example of one such bug. Consider the following code:

Set<School> red = new HashSet<School>();
red.add(one);
red.add(two);
red.add(three);

Set<School> black = new HashSet<School>();
black.add(two);
black.add(one);
black.add(three);

System.out.println("Collection red has " + red.size() + " elements.");
System.out.println("Collection black has " + black.size() + " elements.");

Can you guess what that prints? Actually, you can’t, because it depends on the implementation of HashSet. When you add a key to a HashSet that’s already present in the set, the implementation can do two things. Either, it keeps the original one, in which case it will print:

Collection one has 2 elements.
Collection two has 1 elements.

Or it replaces the old one with the new one, in which case the order will be reversed.

The moral of the story: writing a good equals method is hard; don’t try to make it fancy.

If you want to have a non-transitive equality test, that’s fine, but don’t abuse equals() to achieve that. There are other ways. One of them, as Beust points out in his follow-up post, is to add an id field to the School class and use that to base equals() and hashCode() on. Another way could be to simply add another method to the class:

public boolean isSameSchoo(School other) {
    return nullSafeEqual(name, other.name) ||
            nullSafeEqual(nickname, other.nickname);
}

It even looks a lot better, because you don’t need all that type checking stuff anymore.

P.S.

Just a few paragraphs ago I was saying how I needed to use EqualsVerifier’s forRelaxedEqualExamples mode, because of the relaxed nature of this equality relation, and that the regular mode obviously wouldn’t work. I don’t know what made me decide to test that, anyway:

@Test
public void obviouslyThisTestShouldNotPass() {   // obviously!
    EqualsVerifier.forClass(School.class).verify();
}

It passed.

I’m going to have to fix that

Switching blogging platform - again

Those of you who are subscribed to my blog (hi mom!) might have already noticed from the old posts that suddenly showed up again in Google Reader last night: I’ve switched blogging platforms. Again.

About half a year ago, I blogged about how awesome Telegram was. And still is, by the way. But I wasn’t really happy with it. It takes away the hassle of generating, uploading, and hosting the site. Which is a good thing! But I didn’t feel in control: every time I wanted something off the beaten path, it would be hard, or sometimes even impossible, to make it happen. Even though support issues are resolved quickly, it simply didn’t feel mature enough for me yet. I’m not interested in living on the bleeding edge; I just want something that works. Finally, I didn’t like the existing skins, and I don’t have the skill, or the time to learn the skill, to create a nice skin myself.

I started looking for something I could manage myself, and I quickly found Jekyll (famous for being used by GitHub) and Octopress. (Octopress is to Jekyll as LaTeX is to TeX.) These are nice platforms, but they’re written in Ruby. I’ve fallen out of love with Ruby hard, and I refuse to install RVM or anything like it on my pc. And I couldn’t get Octopress to run without it (although, to be honest, I didn’t try very hard).

Also, Octopress requires you to fork their project, put your data in the same repository as the code, and then do a merge whenever Octopress has an update. This, in my very humble opinion, is horrible. Simply horrible.

So Octopress was out.

And then I found PieCrust. It hasn’t officially reached version 1.0 yet, but already it feels much more complete and mature than Telegram. It’s written in PHP, which isn’t sexy, but it doesn’t have to be. The small issues that I ran into were dealt with swiftly by its creator, Ludovic Chabant. Also, the website and the software have a consistent cooking metaphor running through them, which always brings a smile to my face. In short: it has all the advantages of Telegram, plus a few things I was missing there.

So here it is: my website, baked with PieCrust. Yes, it looks like an Octopress site. I found the skin as a PieCrust template on Google somewhere, and I intend to replace it just as soon as I learn some proper design skills.

Foobal: end-of-year, half-of-season update

Introduction

In a previous post, I promised to keep you updated on the progress of Foobal.

After about five or six weeks, Foobal was squarely in the lead of the family pool. However, after that, Foobal got more and more predictions wrong. It seems to be slightly biased towards tie games (a score of 1-1 was a particularly common prediction), which aren’t all that common in practice. So I decided to fix that. And of course, the two matches after that ended in a tie that Foobal sadly did not predict. Bummer. One of the matches would even have been predicted correctly had I not changed anything. Double bummer. After that, however, Foobal started to get some predictions right again. I’m confident that my changes will work out for the better. I won’t be deterred by statistical anomalies!

Currently, we’re halfway through the season, and the soccer competition is on winter hiatus. Every team has played every other team exactly once, and the second round will start at the end of January. Foobal currently holds a (shared) third place in the family pool, with 11 points out of a theoretically possible 38. The number one already has 18 points, wich will be very tough to beat!

Recap

Before I explain what I changed, I will give a quick recap of what was already there. Foobal allows for several ‘predicters’ to make a prediction. It will then take all these predictions, and take the median of the scores for each team. Note that the resulting prediction may (and probably will) be different from all of the individual predictions. Note, also, that taking the median implies making a ‘conservative’ prediction: the extreme scores simply don’t lie in the middle.

So far, I had implemented two rules. The first one simply ‘predicts’ the outcome of last year’s match, or 0-0 if the teams didn’t play each other last year. The second rule takes, for both teams, the average of the total number of goals scored in the preceding six matches.

The ‘pessimism factor’

I noticed that Foobal rarely predicted a zero score, even though it’s quite common in practice for one, or even both, teams to score no points in a given match. This was due to two facts: first, I only had two rules. Taking the median of two values is, by definition, taking the average of these values. So for Foobal to predict a zero score, both predictions had to be 0, or otherwise the average would be 0.5, which gets rounded up to 1.

Secondly, the ‘average of the last 6 matches’ rule also takes an average (obviously). For this rule to predict a zero, the average would have to be less than 0.5, or else it, too, would get rounded up to 1. The average would therefore have to be between 0.0 and 0.5 to return 0, and between 0.5 and 1.5 to return 1. That’s a ‘range’ of 0.5 points to get 0, and a full ‘range’ of 1.0 to get 1 (or higher). In other words, the chance of this rule predicting 0 is much smaller than for it to predict 1, making it all the more unlikely that the average of both rules would yield a 0, too.

To fix this issue, I introduced a ‘pessimism factor’. The rule now subtracts 0.5 from any given average, evening out the ranges. The chance of a 0 prediction is now equal to the chance of a 1 prediction. I realise that mathematicians might object to this kind of reasoning, but I’m not a mathematician, and this seems to work well enough :).

The leaderboard

The second way to make it more probable to predict a zero, is to add a third rule. That way, Foobal won’t have to take an average anymore, eliminating the rounding issues. Of course, taking the median means that two of the rules will have to predict a 0 for Foobal to predict zero as well, so it would be nice if this new rule occasionally predicts a zero, too. But what rule to add?

I decided to implement a rule that uses the soccer competition’s leaderboard. It looks up both teams, and gives the weaker team an automatic score of 0. The stronger team gets a score equal to half the distance between the two teams. If the teams are close, this results in a score of 0-0 or 1-0, which seems reasonable. If the teams are far apart, the rule might predict scores of up to 9-0. This is quite a big score, but since Foobal takes the median of all predictions, this score will never be the final prediction (unless one of the other rules predicts it, too). What it does, however, is nudge the final prediction towards the bigger of the predictions of the two other rules. This feels reasonable: this rule reflects the fact that a team will, generally, perform better against a weaker team, than against a weaker one.

Of course this meant that I had to implement a leaderboard, too. I could scrape this off a website the same way I scrape the outcomes of the individual matches, but I prefer to keep the scraping to a minimum, as it breaks easily. Besides, deriving a leaderboard from the individual outcomes seemed like a fun programming exercise! The leaderboard that I ended up with isn’t perfect, however: it ignores the number of goals scored when two teams are tied for the same position on the leaderboard. In such cases, the ‘winner’ is currently undecided. This means that Foobal’s leaderboard isn’t very accurate, as such ties are quite common in practice. At the moment of writing, there are five such ties on the official leaderboard, involving more than half of all the teams. However, I decided not to care: in practice, the strength of these tied teams won’t differ a lot, and it adds a small, controlled amount randomness to the whole thing, which I kind of like.

The home team advantage

Another idea that I had, but did not implement, was to give the home team an advantage over the visiting team. As I mentioned above, with two rules, Foobal would take the average of both rules’s predictions. By rounding the home team’s score up, and the visiting team’s score down, the home team would gain an advantage. However, somehow I felt this advantage would be too big. And anyway, the point became moot when I added the third rule. Maybe I’ll reconsider it if I decide to add a fourth rule, though.

Conclusion

So that’s what I did to improve Foobal’s predictions. Did I measure any of my claims before implementing them? Nope. Did I measure any of them after implementing them? Again: nope. This is something that I really should do, but haven’t gotten around to yet. Maybe when we start betting for money. Until then, the purpose of this project is merely to have fun, to learn something in the process, and maybe to scare the occasional uncle.

How to: make a Maven build script for Android

I’ve written before about writing Android apps in Scala. Sometimes, though, Scala isn’t the answer, and you need to fall back on Java. That doesn’t necessarily mean you should also fall all the way back to Ant, though. This post is about creating a Maven build script for an Android app written in Java.

But…why?

In order to have a repeatable build, one that you could also run from a build server, you need a build script. Building and deploying from Eclipse might work fine if you’re a single person on a fire-and-forget project, but once you’re working with more than one person, or if you need to go back to a project you were doing a few months ago on a different version of Eclipse, you’re going to want to have a proper build script.

But Android already provides an Ant script, why would you want to use Maven if you need to install and configure lots of stuff to it? Isn’t Ant good enough?

Yeah, for a simple project, it probably is. But once your project gets more complicated, it’s nice when Maven has your back. Especially if you have to manage dependencies (other than Android), Maven can be a life-saver. And this is coming from someone who has no particular love for Maven. Between Ant and Maven, Maven is, quite simply, the lesser of two evils.

Installing all the stuff

First, download Eclipse. I recommend Eclise Juno (4.2). If you use Indigo (3.7), you will need to install the M2Eclipse plugin as well.

NOTE: are you running Linux on a 64 bit machine? Then do this first: sudo apt-get install ia32-libs It’s somewhere in Google’s documentation that you should do this, but it’s easy to miss and it took me some googling before I figured this out.

Next, install the Android ADT plugin. It comes with a nice wizard that downloads the SDK for the platform(s) of your choice and helps you set up an emulator.

Then, you need to install m2e-android. The install process is described on m2e-android’s website. Don’t have the Eclipse Marketplace installed? You can get it here.

A new project

Instead of creating a new Android Project, you should create a new Maven Project. The first time, you need to configure a Maven archetype. Don’t worry, this is super easy and the m2e-android website explains very well how to do it. It also explains how you can convert an existing Android project to Maven.

Congratulations! You’re good to go now.

Running Maven from the command-line

As I explained above, you might want to run your build on a build server. If you do, or even if you simply want to run your build from the command-line, you need to do a couple extra things.

First, you need to install Maven 3, since it won’t work with Maven 2. Also, make sure it’s on your PATH. Check this by running mvn -version; it should say something like Apache Maven 3.0.4; not Apache Maven 2.2.1 (or some other version in the 2.x.y range).

And while you’re fiddling with environment variables, make sure that the ANDROID_HOME variable is set correctly as well. If you don’t know where your SDK is located, open Eclipse’s Preferences screen from the Window menu. You can find the SDK location in the Android tab.

To make things easier (even though, technically, it’s cheating), you can add your ADV to the Maven pom file. That way, Maven will automatically deploy your app to the given ADV, so you don’t have to provide it manually every time you run Maven. In the <build><plugins> section, find the android-maven-plugin. It has a <configuration> section, where you can add the following lines:

<emulator>
  <adv>my_adv_name</adv>
</emulator>

Obviously, you need to replace my_adv_name with the name of an ADV that you have actually configured.

Once all this is out of the way, you can finally run Maven from the command-line! Useful commands include:

mvn clean install
mvn android:deploy
mvn android:run

Be sure to run mvn clean install at least once before running mvn android:deploy or mvn android:run, or else you might start to see weird error messages about “trying to figure out package name from inside apk file”.

Conclusion

That’s it! Good luck! And if I missed something, please let me know. This kind of instructions tends to change over time, so while this works at the time of writing (which is September 16th, 2012), it might no longer work at the time of reading (which is, well, now).