tr ouwens

by the way: things I want to say

The things we do for compatibility

For EqualsVerifier’s new 1.5 release, I faced a dilemma. EqualsVerifier should support Java 8, but it should also still run under Java 6 and 7. Preferably in a single code base, because maintaining multiple code bases is a hassle (even if it means I get to use lambdas in one of them). Also, there should be unit tests targeting Java 8-specific classes: Does EqualsVerifier support classes that contain lambdas? Does EqualsVerifier support classes with fields of type, say, java.time.ZonedDateTime? These tests should run on Java 8 but should not break on Java 6. Can this even be done?

The answer is: yes. Yes, it can be done. Here’s how.

Compiling

It turns out that Java 6 introduced the javax.tools.JavaCompiler interface. You can use it, unsurprisingly, to compile Java classes at runtime, like so:

private void compileClass(File sourceFile, File tempFolder) throws IOException {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    StandardJavaFileManager fileManager = null;
    try {
        fileManager = compiler.getStandardFileManager(null, null, null);
        fileManager.setLocation(StandardLocation.CLASS_OUTPUT, Arrays.asList(tempFolder));
        Iterable<? extends JavaFileObject> javaFileObjects = fileManager.getJavaFileObjectsFromFiles(Arrays.asList(sourceFile));
        CompilationTask task = compiler.getTask(null, fileManager, null, null, null, javaFileObjects);

        boolean success = task.call();
        if (!success) {
            throw new AssertionError("Could not compile the class");
        }
    }
    finally {
        if (fileManager != null) {
            fileManager.close();
        }
    }
}

Note that we can’t use try-with-resources because of EqualsVerifier’s Java 6 compatibility requirement.

This code assumes that sourceFile is a File reference to a Java source file. It will compile the file and write it to the tempFolder. The filename will be identical to the source file, but with a .class extension instead of a .java extension.

Also, any compile errors are written to the console. Not ideal, but I haven’t yet tried to redirect them so I can show them in the AssertionError somehow. I might do that for a future revision though.
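For what it’s worth, here is a minimal sketch of how that redirection might look, using the standard javax.tools.DiagnosticCollector. This is not what EqualsVerifier does; the class name DiagnosticCompiler and its compile method are made up for this illustration. It returns the collected messages as a String, which could then be put into the AssertionError.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.StandardLocation;
import javax.tools.ToolProvider;

// Hypothetical helper, not part of EqualsVerifier.
public class DiagnosticCompiler {

    /** Compiles sourceFile into outputFolder; returns "" on success, or the collected messages on failure. */
    public static String compile(File sourceFile, File outputFolder) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // getSystemJavaCompiler() returns null when no compiler is available (e.g. on a plain JRE).
            throw new IllegalStateException("No system Java compiler available");
        }

        // The collector receives the diagnostics instead of the console.
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<JavaFileObject>();
        StandardJavaFileManager fileManager = compiler.getStandardFileManager(diagnostics, null, null);
        try {
            fileManager.setLocation(StandardLocation.CLASS_OUTPUT, Arrays.asList(outputFolder));
            Iterable<? extends JavaFileObject> units =
                    fileManager.getJavaFileObjectsFromFiles(Arrays.asList(sourceFile));
            boolean success = compiler.getTask(null, fileManager, diagnostics, null, null, units).call();
            if (success) {
                return "";
            }

            StringBuilder messages = new StringBuilder();
            for (Diagnostic<? extends JavaFileObject> d : diagnostics.getDiagnostics()) {
                messages.append("line ").append(d.getLineNumber())
                        .append(": ").append(d.getMessage(Locale.ENGLISH)).append('\n');
            }
            return messages.toString();
        }
        finally {
            // No try-with-resources, to stay Java 6 compatible.
            fileManager.close();
        }
    }
}
```

A caller could then do something like `throw new AssertionError("Could not compile the class:\n" + messages)` whenever the returned String is not empty.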

Loading

So, now we have a .class file somewhere on our filesystem. However, it’s not on the classpath yet, the JVM doesn’t automagically load it, and we don’t have a Class<?> variable referencing it. So how do we use it? This is where java.net.URLClassLoader comes in:

private URLClassLoader createClassLoader(File tempFolder) {
    try {
        URL[] urls = { tempFolder.toURI().toURL() };
        return new URLClassLoader(urls);
    }
    catch (MalformedURLException e) {
        throw new AssertionError(e);
    }
}

Note that, as of Java 7, the URLClassLoader implements Closeable, which means it has a close() method that needs to be called when we’re done. It’s not Closeable yet in Java 6, so we’ll have to call close() using reflection. I’ll leave it as an exercise to you, my esteemed reader, to figure out how to do that.
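If you’d rather not do the exercise, here is one possible answer, as a rough sketch. The class name ClassLoaderCloser and the method closeIfPossible are my own inventions for this example; the actual EqualsVerifier code may well look different. It looks up close() on URLClassLoader reflectively and only calls it if it exists.

```java
import java.lang.reflect.Method;
import java.net.URLClassLoader;

// Hypothetical helper, not part of EqualsVerifier.
public class ClassLoaderCloser {

    /**
     * Calls URLClassLoader.close() through reflection.
     * Returns true if close() was called (Java 7 and up), false if the method doesn't exist (Java 6).
     */
    public static boolean closeIfPossible(URLClassLoader classLoader) {
        try {
            Method close = URLClassLoader.class.getMethod("close");
            close.invoke(classLoader);
            return true;
        }
        catch (NoSuchMethodException e) {
            // Java 6: URLClassLoader has no close() method, so there is nothing to do.
            return false;
        }
        catch (Exception e) {
            // IllegalAccessException or InvocationTargetException: something really went wrong.
            throw new AssertionError(e);
        }
    }
}
```

Because the method is looked up by name at runtime, this compiles fine against the Java 6 API, which is the whole point.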

The important thing is: now we have a class loader that we can use to load our class and pass it to EqualsVerifier:

Class<?> type = createClassLoader(tempFolder).loadClass("Java8Class");
EqualsVerifier.forClass(type).verify();

Note that this code adds the entire contents of the tempFolder directory to the classpath, so it’s wise to create a fresh, empty directory for this. Since I use this code only in unit tests, I use JUnit’s TemporaryFolder rule to manage this.

Tying it together

The unit test simply contains a raw String:

private static final String JAVA_8_CLASS =
    "\nimport java.util.List;" +
    "\nimport java.util.Objects;" +
    "\n" +
    "\npublic final class Java8Class {" +
    "\n    private final List<Object> objects;" +
    "\n    " +
    "\n    public Java8Class(List<Object> objects) {" +
    "\n        this.objects = objects;" +
    "\n    }" +
    "\n    " +
    "\n    public void doSomethingWithStreams() {" +
    "\n        objects.stream().forEach(System.out::println);" +
    "\n    }" +
    "\n    " +
    "\n    @Override" +
    "\n    public boolean equals(Object obj) {" +
    "\n        if (!(obj instanceof Java8Class)) {" +
    "\n            return false;" +
    "\n        }" +
    "\n        return objects == ((Java8Class)obj).objects;" +
    "\n    }" +
    "\n    " +
    "\n    @Override" +
    "\n    public int hashCode() {" +
    "\n        return Objects.hash(objects);" +
    "\n    }" +
    "\n}";

We can write this String to a java.io.File (in the same tempFolder directory mentioned above), making sure that its name is the name of the class with a .java extension. In this case, that would be Java8Class.java. Then we pass the File reference to the compileClass method defined above, and the circle is complete.
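Writing the String to a file could look something like the sketch below. The names SourceWriter and writeSource are assumptions made for this example, and the choice of UTF-8 is mine as well; for plain ASCII sources like the one above it makes no difference.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical helper, not part of EqualsVerifier.
public class SourceWriter {

    /** Writes source to <folder>/<className>.java and returns the resulting File. */
    public static File writeSource(File folder, String className, String source) throws IOException {
        // The file name has to match the name of the public class it contains.
        File sourceFile = new File(folder, className + ".java");
        Writer writer = new OutputStreamWriter(new FileOutputStream(sourceFile), "UTF-8");
        try {
            writer.write(source);
        }
        finally {
            // Again no try-with-resources, because of Java 6.
            writer.close();
        }
        return sourceFile;
    }
}
```

The returned File is exactly what compileClass expects as its sourceFile argument.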

Java 6

Now what about Java 6? We haven’t used any API calls that aren’t available in Java 6, so that’s good, but obviously the Java8Class string won’t compile. We don’t want our test to fail on that. We can solve this by simply detecting if the test is running on a Java 8 JVM, and if it’s not, simply return. How do we do this? Well…

public boolean isTypeAvailable(String fullyQualifiedTypeName) {
    try {
        Class.forName(fullyQualifiedTypeName);
        return true;
    }
    catch (ClassNotFoundException e) {
        return false;
    }
}

// ...

if (!isTypeAvailable("java.util.Optional")) {
    return;
}

java.util.Optional was introduced in Java 8, so if it’s on the classpath, we know we’re running Java 8 (or higher). It’s a bit of a hack, I know, but to me it felt more reliable than checking Java system properties. And the whole thing is obviously a huge hack anyway, so what’s one more, right? :)

Java 8 API classes

So that takes care of classes containing lambdas, streams, and other Java 8 language features. But we’re not done yet, because what about classes containing fields of a type that was introduced in the Java 8 API? For example, Java 8 introduced the new Java Time API, and some other new classes as well (such as Optional which we abused above). Some of these are defined recursively, meaning EqualsVerifier can’t instantiate them without a little help, so we need to find a way to instantiate these classes and add them to EqualsVerifier’s prefabValues.

We know in advance which classes we need to add to EqualsVerifier’s prefab values (we can simply try them out and make an inventory list), and we also know how to instantiate them (that’s part of the API, after all). We just can’t call the constructor directly, because the class may or may not be on the classpath, depending on the JVM version currently running. Reflection to the rescue!

It turns out there are 3 main ways an instance of a class can be retrieved: through calling its constructor, through calling a static factory method defined on the same class, or through referencing a static constant defined on the class. Since reflection is even more verbose than vanilla Java, I’ve hidden all this away in a nice helper class that allows me to do things like this:

ConditionalPrefabValueBuilder.of("java.lang.Integer")
        .callFactory("valueOf", classes(int.class), objects(42))
        .callFactory("valueOf", classes(int.class), objects(1337))
        .addTo(prefabValues);

The ConditionalPrefabValueBuilder contains similar methods for calling constructors or referencing constants. Behind the curtains, it calls things like Class.forName(), Constructor.newInstance() and Method.invoke(). It contains a lot of try/catch blocks, too. The classes and objects methods are static imports for methods that I wrote that convert a vararg into an array. They just look a lot nicer than new Class<?>[] { int.class } would.
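To give an idea of what goes on behind those curtains, here is a stripped-down sketch of just the factory-method case. It is not the real ConditionalPrefabValueBuilder: the class name ReflectiveFactory and the choice to return null when the type is missing are assumptions made for this illustration.

```java
import java.lang.reflect.Method;

// Hypothetical, simplified stand-in; the real helper classes live in the EqualsVerifier source.
public class ReflectiveFactory {

    /**
     * Calls a static factory method by name.
     * Returns its result, or null if the type isn't on the classpath of the running JVM.
     */
    public static Object callFactory(String typeName, String methodName,
            Class<?>[] paramTypes, Object[] args) {
        try {
            Class<?> type = Class.forName(typeName);
            Method factory = type.getMethod(methodName, paramTypes);
            // The first argument is null because the method is static.
            return factory.invoke(null, args);
        }
        catch (ClassNotFoundException e) {
            // The type doesn't exist on this JVM: quietly skip it.
            return null;
        }
        catch (Exception e) {
            // The type exists but the call failed: that's a mistake in the test setup.
            throw new AssertionError(e);
        }
    }

    /** Converts a vararg into an array, so call sites read a little nicer. */
    public static Class<?>[] classes(Class<?>... types) {
        return types;
    }

    /** Converts a vararg into an array, so call sites read a little nicer. */
    public static Object[] objects(Object... values) {
        return values;
    }
}
```

With the static imports in place, `callFactory("java.lang.Integer", "valueOf", classes(int.class), objects(42))` goes through Integer.valueOf(int), while the same call for a type that doesn’t exist on the running JVM simply gives back null.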

Joda-Time and Google Guava

While I was at it, I also added prefab values for some commonly used classes from Joda-Time and Google Guava, such as LocalTime and ImmutableList. Because, why not?

(Please note that I’m not going to add prefab values for every library out there. But since Joda-Time and Guava are so ubiquitous, I think this has real added value.)

Unit tests

In order to test all this, we can simply write a class containing some of these types, put it in a string, and run it through the compiler, much like we did with the Java 8 class above.

However, this changes one thing quite dramatically. Now that the tests are platform-dependent, they need to be run on each platform before I can release it. After all, if I run the tests only on Java 7, how will I know that I called ConditionalPrefabValueBuilder correctly with my Java 8 java.time.ZonedDateTime? I don’t, that’s how.

But, TravisCI to the rescue! TravisCI is a continuous integration service which is free for open source projects such as EqualsVerifier. I have configured it in such a way that, whenever I push something to GitHub, it triggers a build on OpenJDK 6, OpenJDK 7, Oracle JDK 7, and Oracle JDK 8. Whenever something fails, I’ll receive an e-mail within mere minutes. It’s a life-saver.

Conclusion

If you want to take a look at the full source: it’s all on GitHub. Here are some of the classes I discussed: ConditionalCompiler, Java8ClassTest, ConditionalPrefabValueBuilder and ConditionalInstantiator. Fork away!

So there you have it: quite possibly the biggest hack in my career so far. I’m still not sure whether to be proud or ashamed. But I’ve been working with this for several weeks now, and it works quite well! And it certainly adds value to EqualsVerifier: for me, because I don’t need to maintain separate code bases for different versions of Java. And for you, the user, because you don’t need to worry about which version of EqualsVerifier to use with your version of Java, and because you even get prefab values for Joda-Time and Google Guava as an added bonus.

On profiling

Several times now, I’ve been in a situation where we would have some performance problem, and my team mates would “know” immediately what the problem was and set to work. “No, no,” I would say, “we must profile first. It could be something else than we think!” And then they’d ask me for an example, and I couldn’t come up with one, and they wouldn’t believe me, or they would believe me but think using a profiler is too much work, and they would go ahead with their original idea, and after some work, they would find out that they gained some improvement but not nearly enough. And they would be disappointed and then I would teach them to use the profiler, and profile for 15 minutes, and find something silly to fix, and get a 75% speed increase.

I’m writing this down so next time I can’t come up with an example, I can refer back to this post.

Yesterday, I was doing my yearly Foobal touch-up. One of the things I wanted to fix was a performance problem where, as the year progressed and new scores were added to the data set, the program would get increasingly slow, up to the point where I’d have to remove old data from the data set if I was ever going to get an answer from the damn thing.

Here’s how the program works:

  1. Scrape the latest match results from some website,
  2. Append them to a huge xml file with all the data,
  3. Read the xml file and convert all the elements into data objects,
  4. Inject the data objects into the rule engine,
  5. Let the rule engine do its thang,
  6. Print the answer to the screen.

There were two things that I suspected might cause the slowness. The first was reading the xml and converting it into data objects: there’s a lot of data and xml might not be the most efficient way to store that. It’s a lot of string parsing, and a csv file might serve just as well. The other one was the rule engine: I’m not an expert rule engine developer, and I know at least one of the rules I wrote is very awkwardly implemented. Given the amount of data it has to process, and the fact that I don’t even know if it performs linearly, quadratically, or even exponentially, I figured it might also be the cause of the slowness.

But which of these two suspicions was the real culprit? I had no idea, so I busted out the profiler. (By the way, did you know Oracle ships one for free with the Java JDK? It’s called VisualVM and it’s actually very good. Go try it out!)

After about 10 minutes, I found out that Foobal was spending most of its time not in the rule engine. It was also spending most of its time not in the xml parser. No, it was spending most of its time in the org.joda.time.LocalDate.toDate() method. What!?

Turns out I use that method only once in my application, in the data object that goes between the xml and the rule engine. Here it is:

case class Outcome(
    homeTeam: String,
    outTeam: String,
    homeScore: Int,
    outScore: Int,
    date: LocalDate) {
  
  def millis: Long = date.toDate.getTime

  // other methods elided for brevity
}

I added the millis method because I like to use Joda-Time, but the rule engine doesn’t. Using Longs makes date comparisons a lot easier to do for the rule engine. But why does the program spend so much time there?

If you’re familiar with Scala, you will see that millis is a def, meaning that the body of the method is evaluated each time it’s called. Turns out that it gets called a lot. Like, really a lot. So I changed it into a val, making it a property which gets evaluated only once, when the object is created.
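For readers who don’t speak Scala, here is a rough Java analogue of the difference. It is only an illustration: the class names are made up, and a simple multiplication with a call counter stands in for the expensive LocalDate.toDate() call.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from Foobal.
public class DefVersusVal {

    /** Counts how often the "expensive" conversion runs. */
    public static final AtomicInteger conversions = new AtomicInteger();

    /** Stand-in for the expensive date conversion. */
    public static long expensiveConversion(long epochDay) {
        conversions.incrementAndGet();
        return epochDay * 86400000L;
    }

    /** Like Scala's def: the conversion runs again on every call to millis(). */
    public static final class RecomputingOutcome {
        private final long epochDay;

        public RecomputingOutcome(long epochDay) {
            this.epochDay = epochDay;
        }

        public long millis() {
            return expensiveConversion(epochDay);
        }
    }

    /** Like Scala's val: the conversion runs once, in the constructor. */
    public static final class CachingOutcome {
        private final long millis;

        public CachingOutcome(long epochDay) {
            this.millis = expensiveConversion(epochDay);
        }

        public long millis() {
            return millis;
        }
    }
}
```

Call millis() a thousand times on a RecomputingOutcome and the conversion runs a thousand times; do the same on a CachingOutcome and it runs exactly once.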

Here’s a graph of the speed-up that I got out of that.

[Graph: Speed-up]

Oh, by the way: the scale of the vertical axis is logarithmic. OMG.

So the moral of the story: ten minutes of profiling and 2 seconds of changing 3 bytes, saved me a lot of time rewriting the xml to csv or messing with a rule engine that I don’t understand fully.

Another moral of the story is that I, too, am not immune to the premature optimization bug. If I’d had only one suspicion, instead of two, I might never have fired up VisualVM to find out the cause was actually a third thing that I would never have come up with otherwise.

Let me finish up with Donald Knuth’s quote about premature optimization:

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

Comparing ints and shorts in C#

My colleague Ralph and I recently discovered an interesting bit of C# equality behaviour. Consider the following piece of code:

[Test]
public void Equals()
{
    int i = 42;
    short s = 42;

    Assert.IsTrue(i == s);
    Assert.IsTrue(s == i);
    Assert.IsTrue(i.Equals(s));
    Assert.IsTrue(s.Equals(i)); // fails
}

One would expect the last assert to pass, just like the others. The fact that it doesn’t, tells us two things:

  • The behaviour of == and Equals is different on C#’s primitive integral types. At least, when comparing objects of (slightly) different types.
  • Equals is not symmetric when comparing ints with shorts.

Both are surprising to me. While I don’t like pointing out bugs in things like the .NET Framework (it’s like farting: he who points it out, is usually at fault himself), these do seem to qualify; especially the asymmetry in Equals, which violates the contract.

There’s probably some arcane piece of .NET legacy at work here, so of course, I sent a question to Eric Lippert. I hope I’ll get a response; I’ll post an update if and when I do.

In the meantime: can you, dear readers, offer an explanation?

A better way of typing

I’m happy! I’m happy because I just found out how to get a Compose key on my shiny MacBook. I want you to be happy too, so I’ll show you how you can get one, too. I’ll even show you how to do it on Windows and Ubuntu. But first, I will explain why this makes me so happy. (If you don’t care about the why, just click here and go straight to the how. But you’ll miss a perfectly good geek-out.)

Why?

I’m a language geek. I like to spell words correctly. Words such as naïveté. Actually, I’m also Dutch, so I have to deal with words like coëfficiënt and of course and (which, really, are two very distinct words). I speak French (or should I say: français) and I live in a country where we pay with €. And if that’s not enough, I like to spell people’s names correctly. I’m a software engineer; we’re a pretty international bunch. I’ve worked with people called René, Radovanović, and even Enikő. Yes, that’s an o with a double accent aigu. It’s Hungarian. Try finding that key on your keyboard.

I want it to be easy to type these special characters. I also want it to not get in the way.

Windows has this nifty trick. If you use the US International keyboard layout, you can press a " followed by an e and you automatically get an ë. Sounds nice in theory, but for me, it gets in the way. I’m a programmer; I need to be able to type things like String vowel = "e";. And when I do, I don’t want that to show up like String vowel = ë";. That’s just annoying. On a day to day basis, I have to type "e much more often than I have to type ë.

Of course, if I type a space after I type ", I get my precious ", and I can then type an e and it won’t turn into an ë anymore. I know people, programmers like me even, who have this extra key stroke ingrained in their muscle memory. If they work on a computer where this option is disabled, they type things like String vowel = " e";. Notice the extra space? Now I have nothing against these people, but this is just plain stupid. It’s like hitting your face every five minutes hoping to catch a fly.

OS X has something slightly smarter. To get a special character, you press a special key combination. For example, to get an é, you press ⌥E, followed by e. If you want an ë, you press ⌥U, followed by e. The problem with those combinations is that they’re pretty arbitrary: e for ´, u for ¨, and i for ˆ. In fact, I can never remember what key corresponds to what symbol. It’s not easy.

Also, both methods only support a very limited set of special symbols. Enikő is out of luck; she has to hunt through the Character Map program to spell her name on a non-Hungarian keyboard. If you think that’s too exotic, then you should realise that € isn’t directly supported on many systems, either.

I’m not even going to mention Alt codes.

Surprisingly, Linux, for all its usability issues, has an easy and elegant solution to this mess: the Compose key. You pick a key you don’t use often (I like the Menu key: nobody uses that button anyway) which becomes the Compose key. If you want to enter a special symbol, you hit this button, followed by the two (or more) symbols that you want to ‘compose’. For example, Compose followed by " followed by e becomes ë. Compose followed by = followed by o becomes ő, and Compose followed by = followed by e becomes €. Easy!

And it goes even further than that. Combine - and >, and you get an arrow: →. T and M become ™, < and 3 become ♥, and, I was surprised to find out while researching this article, Compose-C-C-C-P becomes ☭. Seriously.

I like this. It doesn’t get in the way at all. I’m free to type String vowel = "e"; without any ë’s showing up. Also, it’s super easy. In fact, it’s so easy that sometimes when I’m bored, I amuse myself by trying out various combinations of keys to see what comes out. How would you type Æ, ¿ or ©?

By now, I have probably convinced you that you want to have a Compose key, too. So how do you get one? It’s easy.

How?

  • OS X: Install the US custom keyboard layout: download and installation instructions are here.
    Note that MacBooks don’t have a Menu key. This tool uses the § key instead, which is fine, because who uses it anyway? And even if you do, it’s a Compose-s-o away.
  • Windows: Install this nifty little program.
  • Ubuntu: Go to System Settings → Keyboard Layout → Options → Compose key position.
  • Any other Linux? Then you probably already know how to do this.

Enjoy a better way of typing! ☺