tr ouwens

by the way: things I want to say

Hacking Java enums

The other day, I was debugging some enum related code in EqualsVerifier. I had this enum:

enum CallMe { YES, NO, MAYBE }

And two variables, original and clone, containing a value of said enum. Here’s what the bug looked like in Eclipse’s debugger:

Call Me Maybe

So, what do we see here? We see two variables of type EnumHack$CallMe (the enum was an inner class, so that makes sense). Both enums have the same name and ordinal, so they are equal. They also have different ids (33 and 34, respectively), so they’re not the same object.

Wait, what!?

That’s right: two different instances of the same enum constant. In other words, original.equals(clone) returns false, even though both variables are set to CallMe.MAYBE. How is that even possible? I thought it wasn’t. In the words of Joshua Bloch:

This approach [of using a one-element enum to implement a singleton, JO] provides an ironclad guarantee against multiple instantiation, even in the face of sophisticated serialization or reflection attacks.

The Java Language Specification, section 8.9 says this:

An enum type has no instances other than those defined by its enum constants. It is a compile-time error to attempt to explicitly instantiate an enum type (§15.9.1).

The final clone method in Enum ensures that enum constants can never be cloned, and the special treatment by the serialization mechanism ensures that duplicate instances are never created as a result of deserialization. Reflective instantiation of enum types is prohibited. Together, these four things ensure that no instances of an enum type exist beyond those defined by the enum constants.

Clearly, despite the fact that an enum constant can never be cloned, I had a clone on my hands.

So what happened? I traced the problem back to this:

public void hackAnEnum() throws Exception {
    CallMe original = CallMe.MAYBE;
    CallMe clone = ObjectAccessor.of(original).copy();
    assertEquals(original.ordinal(), clone.ordinal());
    assertFalse(original == clone);

I added the asserts so you can see for yourself what’s going on: if you copy/paste this to your IDE and put EqualsVerifier on the classpath, you’ll be able to run it. This test passes, which means that both original.equals(clone) and original == clone are, indeed, false.

So what does this ObjectAccessor do? It’s part of EqualsVerifier’s reflection library, and as you probably guessed, it makes a copy of the given object. If I factor out all EqualsVerifier code, I end up with this:

CallMe clone = new ObjenesisStd().newInstance(CallMe.class);
for (Field f : Enum.class.getDeclaredFields()) {
    f.set(clone, f.get(original));

Apart from the reference to ObjenesisStd, this is all standard Java reflection code. So, what is this Objenesis thing, then? From their website:

Objenesis is a small Java library that serves one purpose: To instantiate a new object of a particular class.

In other words, it can instantiate any object, without calling its constructor. So how does Objenesis work, exactly? Well, that depends. It uses the Strategy pattern to choose from several different ways of instantiating objects, depending on what kind of JVM you’re running, and probably some other factors, too. In my case, it expanded to this:

Constructor<Object> objectConstructor =
    Object.class.getConstructor((Class[]) null);
Class<?> reflectionFactoryClass =
Method method = reflectionFactoryClass.getDeclaredMethod("getReflectionFactory");
Object reflectionFactory = method.invoke(null);
Method newConstructorForSerializationMethod =
    reflectionFactoryClass.getDeclaredMethod("newConstructorForSerialization", Class.class, Constructor.class);
Constructor<CallMe> ctr = (Constructor<CallMe>)
    newConstructorForSerializationMethod.invoke(reflectionFactory, CallMe.class, objectConstructor);
CallMe clone = ctr.newInstance((Object[]) null);

That’s some incredibly hairy scary code. Let’s pretend we never saw this. The only thing we need to remember is that all this can be done without resorting to actually changing the bytecode at runtime; it’s all reflection.

While Objenesis maybe a relatively unknown library, it is actually pretty widely used. The most famous libraries that use it are Mockito (and basically all mocking frameworks), and Spring Framework. But there are more. Many more. EqualsVerifier uses it to instantiate values for the fields of the class it’s testing.

OK, now we that know this, can we go one further? It turns out we can:

Call Me Sometime

We can add our own enum constants. Here’s how:

CallMe sometime = new ObjenesisStd().newInstance(CallMe.class);
Field ordinal = Enum.class.getDeclaredField("ordinal");
ordinal.set(sometime, 4);
Field name = Enum.class.getDeclaredField("name");
name.set(sometime, "SOMETIME");

It’s actually pretty simple. I guess the JVM’s guarantees aren’t ironclad enough for this particular sophisticated reflection attack. And to think I actually found it by accident! I fixed the bug in EqualsVerifier before it ever got released into production, so all’s well that ends well, I guess.

P.S. Oh and don’t try this at home kids! …or at least, not in production.

Configuring my NAS

The other day, I had to re-install my NAS, and of course, I’d forgotten exactly how I had configured it. That meant finding back lots of websites, some of which apparently didn’t exist anymore. This time, I’ve written down all the steps I’ve taken, just in case, for some sad, unforeseen reason, I ever have to do it again.

What do I want from my NAS? Not much, actually:

  • I want it to store my files (duh)
  • I want to use an rsync script to copy my photos to my NAS
  • I want filenames with special characters (éüå) to be treated sanely by said rsync script
  • I want it to run CrashPlan so all my data is automatically backed up to the cloud

I don’t need it to be a media server; I have Spotify and Netflix for that.

My NAS is a Synology DS213j running DSM 5.1.

Starting out

After choosing a server name, a user name and a password, make sure to first install all the updates from the Control Panel. Add the user account to the administrators group, then disable admin and guest account. Next, create a shared folder: in File Station, click Create, Create New Shared Folder. Give it a name and click OK. Give your user Read/Write permissions to the folder.

Set up SSH

Once the preliminaries are out of the way, you can set up SSH, which is needed for rsync.

  • Go to Control Panel, “Terminal & SNMP” and check “Enable SSH service”
  • Go to Control Panel, User, Advanced, and check “Enable user home service”. This is needed to be able to log in as a different user than root later.

You now have root SSH access though the Admin account. (You disabled that account earlier, but root is still active, and its password is the same as the NAS’s admin account’s password.) Let’s try it from the command-line: ssh root@diskstation. You have to type ‘yes’, then the admin password, and you’re in. Nice.

You want to be able to log in using the new user account you defined earlier. From a root SSH session:

  • Open /etc/passwd with vi.
  • Find the line for your user, and at the end of that line, replace /sbin/nologin with /bin/sh.
  • Run chmod u+s /bin/busybox, or else you won’t be able to gain root access later. (You’ll get errors like “must be suid to work properly” when you try to run su.) In fact, I’ve lost root once after an update because the permissions of this file were reset, so I invite you to add it to the crontab as well:
  • vi /etc/crontab, then add the line 0 0 * * * root chmod u+s /bin/busybox. That will restore the permission every day at midnight.
  • Restart the diskstation to make sure that the new crontab is loaded. You may want to test this to convince yourself it works, before you proceed, because if something goes wrong, you’ll be locked out of your NAS.

Now, you really should disable direct root access. Perform the following steps carefully, or you might get locked out of root forever.

  • vi /etc/ssh/sshd_config and change the line #PermitRootLogin yes to PermitRootLogin no.
  • Let’s also change #MaxAuthTries 6 to MaxAuthTries 3.

If you want to beef up security a bit more with SSH keys, read this article.

Enable special characters

Not all Synology diskstations support UTF-8 on the terminal, which is where you want to run rsync. So if you have files or folders with non-ASCII characters in them, you may want to make sure that your diskstation does support them, or else you’ll run into trouble later.

SSH into your diskstation and type locale. If it looks like this, you’re good:


I.e., en_US.utf8 should be everywhere. If it’s not, go to this blog post and follow the instructions to fix it. It’s a bit of work, but it’s worth it, I promise.

Enable rsync

  • Go to the diskstation’s main menu and start the app “Backup & Replication”.
  • Go to the Backup Services tab.
  • Check “Enable network backup service”.
  • Click Apply.

Now you can rsync from the pc to the NAS like this (assuming your shared folder is called documents):

rsync --delete -av --exclude ".DS_Store" --exclude ".fseventsd"
      --exclude ".Spotlight-V100" --exclude ".TemporaryItems"
      --exclude ".Trashes" --exclude "@eaDir"
      /some/directory/ user_name@diskstation:/volume1/documents

Install CrashPlan

  • Open the Package Center’s Settings, and change the Trust Level to “Any publisher”.
  • Just follow Scott Hanselman’s instructions. He explains it much better than I could. Note that, if you’re on a Mac, CrashPlan’s configuration file can be found in /Applications/

The CrashPlan client prevents the diskstation from going into hibernation mode. Therefore, it may be a good idea to schedule it to run only for a few hours a day. Log in via SSH and gain root via su. Then vi /etc/crontab and add the following two lines:

0 1 * * * root /var/packages/CrashPlan/scripts/start-stop-status start
0 5 * * * root /var/packages/CrashPlan/scripts/start-stop-status stop

That starts up the service at 01:00 and shuts it back down at 05:00.

Wrapping up

Restart the diskstation one final time, to make sure all settings have taken effect.

That’s it! We’re done.

It's just test code

“It’s just test code” is something I hear way too often, from people who really should know better. Somehow, there is a pervasive sentiment that unit test code does not need to be held to the same standards as production code. I’m not sure why, but it seems to be associated with the idea that the “customer” will never “see” that code.

This kind of thinking is dangerous. As your code base grows, badly maintained test code will slow you down at least as much as badly maintained production code does.

I’ve seen tests that were so terse that I couldn’t figure out what they were testing. I’ve seen tests that were so long and convoluted that I couldn’t figure out how to fix them after they broke when I changed something in the production code. I’ve seen suites of tests that were so badly organized that I couldn’t figure out where to find the tests for a specific feature. All of these were written by people who are paid to write this stuff. Some of them were even me!

For example, EqualsVerifier’s test suite had devolved over the years into a state that it was virtually unmaintainable. Fixing bugs had become very hard and time-consuming. This was not because the production code was hard to understand; I’d kept that sufficiently well-factored. No, it was because it had become increasingly hard to fix the existing tests after a change, and to find a good location for a new test. When I did a training with Michael Feathers last year, I asked him how I could best fix this. His advice was basically to start over from scratch.

I did not start over from scratch. I did, however, spend several weeks cleaning up and heavily refactoring the test code. Here are some things I’ve learned from that experience, and also from experience on other projects. I’ll use a running example of classes modeling a card game.

Test naming

This is the most important thing. You might understand what a test is supposed to test when you’re writing it, but will you still understand it in three months? Will somebody else?

For example, one of your test methods might be named testShuffle(). First of all, we’re not living in the nineties anymore. With JUnit 4’s @Test, there’s no need to prefix all your test methods with the word test. It’s just clutter, so get rid of it. Second of all, I can see that this test probably performs a test on the shuffle() method. But what does it test, exactly? The happy path? If so, which happy path? Maybe it even tests all possible interactions with this method? If so, it’ll be quite a long test method, I suppose.

Another example of a badly named test is shuffleFails(). Why does it fail? And why is it supposed to? How am I supposed to know? I could look at the code, but even that might not tell me everything I need to know. It might be specified behavior, or an observation of a known bug that I want to change later.

For EqualsVerifier, I’ve started using a modified “given-when-then” style for naming my test methods. I reverse the order to “then-when-given”, and I often leave out the “given” or “when” part when I don’t need them. I like to start the name with the method I’m exercising (if appropriate), and I separate the parts with an underscore for extra readability. For example: shuffleReshufflesTheDeck_givenAPreShuffledDeck(), dealGivesThreeCards_whenRequestingThreeCards_givenAShuffledDeck(), dealThrowsIllegalStateException_whenCalled_givenAnUnshuffledDeck().

Yes, those are long Java names, but it beats having a shorter name and putting the “given-when-then” description in a comment for several reasons. First of all, you won’t have to invent a short name for your test in addition to the long “given-when-then” name. Naming things is hard, so this is one less thing you need to name. Secondly, the method name is picked up by your test runner, while any comments are not. So if a test fails, you immediately know what’s up; you don’t need to go to the test to read the “given-when-then”.

In languages and/or framworks that support it, I always prefer a string over an identifier to name my tests. For example, in ScalaTest you can also write:

"shuffle" should "reshuffle the deck given a pre-shuffled deck" in {
  // ...

Obviously, this is much more readable than a camel-cased Java identifier.

Magic numbers

Many people believe magic numbers are OK in unit tests. I think, however, that introducing constants can be very beneficial in making a test more understandable. For example:

@Rule ExpectedException thrown = ExpectedException.none();

public void dealThrowsISE_whenAsking79Cards() {

public void dealThrowsISE_whenAskingTooManyCards() {
    thrown.expect(IllegalStateException.class); + 1);

Not everybody might immediately recognize that 79 is not some aribitrarily large number, but that it is, in fact, precisely one more than the number of cards in the French game of Tarot. In the second example, it’s immediately clear what’s going on. You don’t even need to see the actual value for NUMBER_OF_CARDS_IN_A_DECK.

Here’s another example:

public void dealGivesTheExactNumberOfCardsAsked_whenAskingSomeNumber() {
    List<Card> hand =;
    assertThat(hand.size(), is(SOME_NUMBER));

public void dealGivesAFullHandOfCards_whenAskingTheNumberOfCardsInAHand() {
    List<Card> hand =;
    assertThat(hand.size(), is(NUMBER_OF_CARDS_IN_A_HAND));

The difference may be subtle, but in the first case, I don’t care about the specific number as long as it’s the same in both instances, whereas in the second one, I do care about the specific number. The first test looks like it’s part of a test suite for a generalized CardDeck class, while the second one might be specific to a particular game.

Note that magic numbers aren’t necessarily numbers; they can be Strings or even entire objects too. When I don’t care about the specific magic number, I like to prefix it with the word “Some”.


As stated above, I’ve seen single test methods that were over twenty lines in length. They often contain comments which may or may not have gotten stale, lots of initialization code that may or may not be copy-pasted over several different test methods in slightly modified forms, and lots and lots of assertions checking tiny little fields on a big object. When one of these tests fails, or you’ve made a change in the production code that requires the test to change as well, it might take a very long time to figure out what’s going on in that test, what it’s actually doing, and what it should be doing. Also, if one assert fails, you don’t know in advance how many of the following asserts in the method will fail as well. You don’t have all the information that you should be having.

In other words, these tests tend to be very brittle, leading to lots of frustrating time wasted in maintaining them. So how to fix this smell?

It should be obvious, really. Test code is code, so you can refactor it. Move initialization code to an @Before set-up method, or to a private helper method. Give the helper method parameters if you have different flavors of initialization. Big assert blocks can be moved to a helper too. So instead of this:

    assertThat(hand.get(0).getValue(), is(Value.ACE));
    assertThat(hand.get(0).getSuit(), is(Suite.SPADES));

You should have something like this:

    assertCard(hand.get(0), Value.ACE, Suite.SPADES);

// ...

private void assertCard(Card actualCard, Value expectedValue, Suite expectedSuite) {
    assertThat(actualHand.getValue(), is(expectedValue));
    assertThat(actualHand.getSuit(), is(expectedSuite));

However, as I said, you shouldn’t be asserting too much in a single test, because you’ll lose information. If the first assert fails, you no longer know what the rest will do. It might be the difference between a typo and a major design flaw and you won’t know until you’ve fixed the first assertion and the second one can be executed, and the third, etc. But when you refactor this test into several smaller ones, you’ll immediately see how big your problems are just by the number of tests that are failing.

By refactoring mercilessly, your test methods will become shorter and clearer. It will become easier to understand what a test does, and it will become easier to maintain a whole suite of tests, because if something changes in the initialization, there’s now only one place where you need to make that change, instead of lots of different individual test methods.

Additionally, remember those comments that may or may not have gotten stale? If you’ve named your helper methods well, you probably don’t need those comments anymore. Just sayin’.

This is what I mean when I say that test code is code, too. You don’t tolerate excessive copy-pasting in production code; why would you tolerate it in test code?

Arrange, act, assert

This is the same thing as “given-when-then”, and it’s a good way to organize your test method. Consider:

public void dealGivesKnownCards_whenShufflingWithAKnownSeed() {
    Random rand = new Random(TEST_SEED);
    CardDeck deck = new CardDeck(rand);
    assertThat(deck.size(), is(52));
    List<Card> hand =;
    Card one = hand.get(0);
    assertCard(one, Value.ACE, Suite.SPADES);
    Card two = hand.get(1);
    assertCard(two, Value.QUEEN, Suite.HEARTS);


public void dealGivesKnownCards_whenShufflingWithAKnownSeed() {
    // Arrange
    Random rand = new Random(TEST_SEED);
    CardDeck deck = new CardDeck(rand);

    // Act
    List<Card> hand =;
    Card one = hand.get(0);
    Card two = hand.get(1);

    // Assert
    assertThat(deck.size(), is(52));
    assertCard(one, Value.ACE, Suite.SPADES);
    assertCard(two, Value.QUEEN, Suite.HEARTS);

This is obviously a contrived example, but in general, it’s nice to always keep tests in the same order, so you will always know where to find things in tests that you may not have seen in a while. Note that you don’t always need an Arrange section, as it might be covered by the @Before set-up method. If I’m unable to disentangle your test in this way, I see it as a signal that I need to refactor the test into several smaller ones. If any of the blocks becomes more than a few lines in length, I will start extracting methods.

By the way, I don’t actually write the comments in real life, I added them here for educational purposes.

Static analysis

If you use static analysis tools like FindBugs or CheckStyle, do you apply them to your test code as well? Because you should. Use the same configuration that you use for your production code, but with a few exceptions. For example, in test code, it’s ok if test methods throw Exception, and test classes can have non-static public fields (as they are required for JUnit’s Rules). Therefore, you can disable the corresponding CheckStyle checks for your test code.


Test code is not “just” test code; you should treat it as if it were production code, and keep it to the same standards. Refactor it. Take care of it. Love it. If you do, you will end up with a test suite that you can actually maintain. Future-you will love you for it.

The things we do for compatibility

For EqualsVerifier’s new 1.5 release, I faced a dilemma. EqualsVerifier should support Java 8, but it also should still run under Java 6 and 7. Preferably in a single code base, because maintaining multiple code bases is a hassle (even if it means I get to use lambdas in one of them). Also, there should be unit tests targeting Java 8-specific classes: does EqualsVerifier support classes that contain lambdas? does EqualsVerifier support classes with fields of type, say, java.time.ZonedDateTime? These tests should run on Java 8 but should not break on Java 6. Can this even be done?

The answer is: yes. Yes, it can be done. Here’s how.


It turns out that Java 6 introduced the interface. You can use it, usurprisingly, to compile Java classes at runtime, like so:

private void compileClass(File sourceFile, File tempFolder) throws IOException {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    StandardJavaFileManager fileManager = null;
    try {
        fileManager = compiler.getStandardFileManager(null, null, null);
        fileManager.setLocation(StandardLocation.CLASS_OUTPUT, Arrays.asList(tempFolder));
        Iterable<? extends JavaFileObject> javaFileObjects = fileManager.getJavaFileObjectsFromFiles(Arrays.asList(sourceFile));
        CompilationTask task = compiler.getTask(null, fileManager, null, null, null, javaFileObjects);

        boolean success =;
        if (!success) {
            throw new AssertionError("Could not compile the class");
    finally {
        if (fileManager != null) {

Note that we can’t use try-with-resources because of EqualsVerifier’s Java 6 compatibility requirement.

This code assumes that sourceFile is a File reference to a Java source file. It will compile the file and write it to the tempFolder. The filename will be identical to the source file, but with a .class extension instead of a .java extension.

Also, any compile errors are written to the console. Not ideal, but I haven’t tried yet to redirect them so I can show the in the AssertionError somehow. I might do that for a future revision though.


So, now we have a .class file somewhere on our filesystem. However, it’s not on the classpath yet, the JVM doesn’t automagically load it, and we don’t have a Class<?> variable referencing it. So how do we use it? This is where comes in:

private URLClassLoader createClassLoader(File tempFolder) {
    try {
        URL[] urls = { tempFolder.toURI().toURL() };
        return new URLClassLoader(urls);
    catch (MalformedURLException e) {
        throw new AssertionError(e);

Note that, as of Java 7, the URLClassLoader implements Closeable, which means it has a close() method that needs to be called when we’re done. It’s not Closeable yet in Java 6, so we’ll have to call close() using reflection. I’ll leave it as an exercise to you, my esteemed reader, to figure out how to do that.

The important thing is: now we have a class loader that we can use to load our class and pass it to EqualsVerifier:

Class<?> type = createClassLoader(tempFolder);

Note that this code adds the entire contents of the tempFolder file to the classpath, so it’s wise to create a fresh, empty directory for this. Since I use this code only in unit tests, I use JUnit’s TemporaryFolder rule to manage this.

Tying it together

The unit test simply contains a raw String:

private static final String JAVA_8_CLASS =
    "\nimport java.util.List;" +
    "\nimport java.util.Objects;" +
    "\n" +
    "\npublic final class Java8Class {" +
    "\n    private final List<Object> objects;" +
    "\n    " +
    "\n    public Java8Class(List<Object> objects) {" +
    "\n        this.objects = objects;" +
    "\n    }" +
    "\n    " +
    "\n    public void doSomethingWithStreams() {" +
    "\n;" +
    "\n    }" +
    "\n    " +
    "\n    @Override" +
    "\n    public boolean equals(Object obj) {" +
    "\n        if (!(obj instanceof Java8Class)) {" +
    "\n            return false;" +
    "\n        }" +
    "\n        return objects == ((Java8Class)obj).objects;" +
    "\n    }" +
    "\n    " +
    "\n    @Override" +
    "\n    public int hashCode() {" +
    "\n        return Objects.hash(objects);" +
    "\n    }" +

We can write this String to a (in the same tempFolder directory mentioned above), making sure that it’s name is the name of the class with a .java extension. In this case, that would be Then we pass the File reference to the compileClass method defined above, and the circle is complete.

Java 6

Now what about Java 6? We haven’t used any API calls that aren’t available in Java 6, so that’s good, but obviously the Java8Class string won’t compile. We don’t want our test to fail on that. We can solve this by simply detecting if the test is running on a Java 8 JVM, and if it’s not, simply return. How we do this? Well…

public boolean isTypeAvailable(String fullyQualifiedTypeName) {
    try {
        return true;
    catch (ClassNotFoundException e) {
        return false;

// ...

if (!isTypeAvailable("java.util.Optional")) {

java.util.Optional was introduced in Java 8, so if it’s on the classpath, we know we’re running Java 8 (or higher). It’s a bit of a hack, I know, but to me it felt more reliable than checking Java system properties. And the whole thing is obviously a huge hack anyway, so what’s one more, right? :)

Java 8 API classes

So that takes care of classes containing lambdas, streams, and other Java 8 language features. But we’re not done yet, because what about classes containing fields of a type that wat introduced in the Java 8 API? For example, Java 8 introduced the new Java Time API, and some other new classes as well (such as Optional which we abused above). Some of these are defined recursively, meaning EqualsVerifier can’t instantiate them without a little help, so we need to find a way to instantiate these classes and add them to EqualsVerifier’s prefabValues.

We know in advance which classes we need to add to EqualsVerifier’s prefab values (we can simply try them out and make an inventory list), and we also know how to instantiate them (that’s part of the API, after all). we just can’t call the constructor directly, because the class may or may not be on the classpath, depending on the JVM version currently running. Reflection to the rescue!

It turns out there are 3 main ways an instance of a class can be retrieved: through calling its constructor, through calling a static factory method defined on the same class, or through referencing a static constant defined on the class. Since reflection is even more verbose than vanilla Java, I’ve hidden all this away in a nice helper class that allows me to do things like this:

        .callFactory("valueOf", classes(int.class), objects(42))
        .callFactory("valueOf", classes(int.class), objects(1337))

The ConditionalPrefabValueBuilder contains similar methods for calling constructors or referencing constants. Behind the curtains, it calls things like Class.forName(), Constructor.newInstance() and Method.invoke(). It contains a lot of try/catch blocks, too. The classes and objects methods are static imports for methods that I wrote that convert a vararg into an array. They just look a lot nicer than new Class<?>[] { int.class } would.

Joda-Time and Google Guava

While I was at it, I also added prefab values for some commonly used classes from Joda-Time and Google Guava, such as LocalTime and ImmutableList. Because, why not?

(Please note that I’m not going to add prefab values for every library out there. But since Joda-Time and Guava are so ubiquitous, I think this has real added value.)

Unit tests

In order to test all this, we can simply write a class containing some of these types, put it in a string, and run it through the compiler, much like we did with the Java 8 class above.

However, this changes one thing quite dramatically. Now that the tests are platform-dependent, they need to be run on each platform before I can release it. After all, if I run the tests only on Java 7, how will I know that I called ConditionalPrefabValueBuilder correctly with my Java 8 java.time.ZonedDateTime? I don’t, that’s how.

But, TravisCI to the rescue! TravisCI is a continuous integration service which is free for open source projects such as EqualsVerifier. I have configured it in such a way that, whenever I push something to GitHub, it triggers a build on OpenJDK 6, OpenJDK 7, Oracle JDK 7, and Oracle JDK 8. Whenever something fails, I’ll receive an e-mail within mere minutes. It’s a life-saver.


If you want to take a look at the full source: it’s all on GitHub. Here are some of the classes I discussed: ConditionalCompiler, Java8ClassTest, ConditionalPrefabValueBuilder and ConditionalInstantiator. Fork away!

So there you have it: quite possibly the biggest hack in my career so far. I’m still not sure whether to be proud or ashamed. But I’ve been working with this for several weeks now, and it works quite well! And it certainly adds value to EqualsVerifier: for me, because I don’t need to maintain separate code bases for different versions of Java. And for you, the user, because you don’t need to worry about which version of EqualsVerifier to use with your version of Java, and because you even get prefab values for Joda-Time and Google Guava as an added bonus.

On profiling

Several times now, I’ve been in a situation where we would have some performance problem, and my team mates would “know” immediately what the problem was and set to work. “No, no,” I would say, “we must profile first. It could be something else than we think!” And then they’d ask me for an example, and I couldn’t come up with one, and they wouldn’t believe me, or they would believe me but think using a profiler is too much work, and they would go ahead with their original idea, and after some work, they would find out that they gained some improvement but not nearly enough. And they would be disappointed and then I would teach them to use the profiler, and profile for 15 minutes, and find something silly to fix, and get a 75% speed increase.

I’m writing this down so next time I can’t come up with an example, I can refer back to this post.

Yesterday, I was doing my yearly foobal touch-up. One of the things I wanted to fix was a performance problem where, as the year progressed and new scores were added to the data set, the program would get increasingly slow, up to the point where I’d have to remove old data from the data set if I ever was going to get an answer from the damn thing.

Here’s how the program works:

  1. Scrape the latest match results from some website,
  2. Append them to a huge xml file with all the data,
  3. Read the xml file and convert all the elements into data objects,
  4. Inject the data objects into the rule engine,
  5. Let the rule engine do its thang,
  6. Print the answer to the screen.

There were two things that I suspected might cause the slowness. The first was reading the xml and converting it into data objects: there’s a lot of data and xml might not be the most efficient way to store that. It’s a lot of string parsing, and a csv file might serve just as well. The other one was the rule engine: I’m not an expert rule engine developer, and I know at least one of the rules I wrote is very awkwardly implemented. Given the amount of data it has to process, and the fact that I don’t even know if it performs linearly, quadratically, or even exponentially, I figured it might also be the cause of the slowness.

But which of these two suspicions was the real culprit? I had no idea, so I busted out the profiler. (By the way, did you know Oracle ships for free with the Java JDK? It’s called VisualVM and it’s actually very good. Go try it out!)

After about 10 minutes, I found out that Foobal was spending most of its time not in the rule engine. It was also spending most of its time not in the xml parser. No, it was spending most of its time in the org.joda.time.LocalDate.toDate() method. What!?

Turns out I use that method only once in my application, in the data object that goes between the xml and the rule engine. Here it is:

case class Outcome(
    homeTeam: String,
    outTeam: String,
    homeScore: Int,
    outScore: Int,
    date: LocalDate) {
  def millis: Long = date.toDate.getTime

  // other methods elided for brevity

I added the millis method because I like to use Joda-Time, but the rule engine doesn’t. Using Longs makes date comparisons a lot easier to do for the rule engine. But why does the program spend so much time there?

If you’re familiar with Scala, you will see that millis is a def, meaning that the body of the method is evaluated each time it’s called. Turns out that it gets called a lot. Like, really a lot. So I changed it into a val, making it a property which gets evaluated only once, when the object is created.

Here’s a graph of the speed-up that I got out of that.


Oh, by the way: the scale of the vertical axis is logarithmic. OMG.

So the moral of the story: ten minutes of profiling and 2 seconds of changing 3 bytes, saved me a lot of time rewriting the xml to csv or messing with a rule engine that I don’t understand fully.

Another moral of the story is that I, too, am not immune to the premature optimization bug, either. If I’d had only one supsicion, instead of two, I might never have fired up VisualVM to find out the cause was actually a third thing that I would never have come up with otherwise.

Let me finish up with Donald Knuth’s quote about premature optimization:

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”