25 October 2011

Reducing boilerplate in Scala

I've often heard Scala advocates claim that coding in Scala rather than Java saves time, cuts the number of lines of code and removes boilerplate.
They say the result is clearer and easier to read.


So, as a long-term project, I'm going to translate one of our reasonably sized (~50k lines) projects from Java to Scala. I'll post whenever I find something interesting.

I'm doing this to answer the following questions:
  • Can I mix and match Scala and Java? This is one of the selling points of Scala: you can use Java technology and libraries easily from Scala. My project uses (old versions of) Hibernate, Spring and Spring MVC. We'll see.

  • When I've finished, will I have fewer lines of code? Again, one of the selling points of Scala.

  • When I've finished, will my code be understandable? One of the points of contention of Scala is its perceived complexity. The old adage says: 'You can write Fortran in any language', but will we end up with a codebase which is unavoidably incomprehensible? Inherent complexity is one thing, accidental complexity is another, but will we have designed-in complexity?

  • Is the tooling up to the job? I mean in particular Eclipse, Maven and some of the other Eclipse plugins. The Scala-IDE has improved a lot recently, but is it robust enough?

One thing I want to avoid is refactoring that could be done in Java. I want to compare well written Java code with well written Scala.
I'm going to translate class by class where possible, and then improve the code, make it more idiomatic.

I am reliably informed that the best place to start is my testing code, my JUnit tests. Testing code isn't delivered to the customer, so you can try something without affecting production code.

First things first. The old project was developed using Eclipse Galileo. This is no longer an option if I'm going to use Scala, because the Scala plugin doesn't work with it.
I'll need to upgrade to Helios.
This is essentially pain-free (except for some Maven issues, which I'll deal with later).

The project contains some SOAP services (developed using Apache Axis2). We test using a stub class and we have one test per service.
When we translate the tests directly from Java to Scala, we don't usually gain very much. For example, given a Java method such as:
    private Calendar getDate(String dateString) throws Exception {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(new SimpleDateFormat("dd.MM.yyyy").parse(dateString));
        return calendar;
    }
we end up with the following Scala method:
    private def getDate(dateString: String) = {
      val calendar = Calendar.getInstance()
      calendar.setTime(new SimpleDateFormat("dd.MM.yyyy").parse(dateString))
      calendar
    }
So the only things we've gained are the omitted return type (which is inferred to be Calendar), the omitted return keyword and the absence of throws Exception. Scala does not have checked exceptions, so no throws clause is needed.
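To see both points in isolation, here is a minimal, self-contained sketch. The object name DateExample and the main method are only scaffolding for illustration, not part of the project, and scala.util.Try assumes Scala 2.10 or later.

```scala
import java.text.SimpleDateFormat
import java.util.Calendar

object DateExample {
  // No declared return type: the compiler infers java.util.Calendar from the last expression.
  // No throws clause either, although SimpleDateFormat.parse declares a checked ParseException.
  def getDate(dateString: String) = {
    val calendar = Calendar.getInstance()
    calendar.setTime(new SimpleDateFormat("dd.MM.yyyy").parse(dateString))
    calendar
  }

  def main(args: Array[String]): Unit = {
    // The explicit annotation only compiles because the inferred type really is Calendar.
    val date: Calendar = getDate("25.10.2011")
    println(date.get(Calendar.YEAR))

    // The exception has not gone away; it is simply unchecked from Scala's point of view.
    val failed = scala.util.Try(getDate("not a date"))
    println(failed.isFailure)
  }
}
```

The caller is never forced to catch or declare anything, but an unparseable string still fails at runtime, so the tests need to expect that where it matters.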
Some methods, however, condense down a lot.
    private Set<String> getErrorCode(ErrorTo[] errors) {
        Set<String> set = new TreeSet<String>();

        for (ErrorTo error : errors) {
            set.add(error.getCode());
        }

        return set;
    }
In Scala, this becomes:
    private def getErrorCode(errors: Array[ErrorTo]) = new TreeSet(errors.map(_.getCode).toSet)
There is actually quite a lot to see here. Scala is much more expressive when dealing with collections. The map() method applies a function to every entry in a collection, in this case an Array, and returns a new collection holding the results. We're applying getCode to each entry in the array, where _ is a placeholder for the current element. So we're converting from an Array[ErrorTo] to an Array[String].
We then convert this to a Set (a Scala Set) and use it to populate a java.util.TreeSet, because we wish to maintain interoperability with Java. For the minute.
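The map and toSet steps can be tried on their own with a small sketch. The ErrorTo class below is a hypothetical stand-in for the project's real class, which isn't shown in this post, and the names MapExample and codesOf are made up for the example.

```scala
object MapExample {
  // Hypothetical stand-in for the project's ErrorTo class (JavaBean-style getter).
  class ErrorTo(code: String) {
    def getCode: String = code
  }

  // `_.getCode` is shorthand for the function literal `error => error.getCode`.
  // Calling map on an Array[ErrorTo] yields an Array[String].
  def codesOf(errors: Array[ErrorTo]): Array[String] = errors.map(_.getCode)

  def main(args: Array[String]): Unit = {
    val errors = Array(new ErrorTo("E2"), new ErrorTo("E1"), new ErrorTo("E2"))

    val codes = codesOf(errors)
    println(codes.mkString(","))

    // toSet drops the duplicate and gives an immutable Scala Set.
    val unique: Set[String] = codes.toSet
    println(unique.size)
  }
}
```

Note that a plain Scala Set has no guaranteed ordering, which is one reason the original method still wraps the result in a TreeSet.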

We're using implicit conversions to convert between Scala and Java collections. In Scala, we can define an implicit conversion between two types so that if an expression has one type where the other is expected, the compiler inserts the conversion for us, as if by magic. Here, the toSet method returns a Scala Set,
but java.util.TreeSet doesn't have a constructor which accepts a Scala Set, so the Set has to be converted to a Java collection. To bring the conversions into scope, we add this import:
    import scala.collection.JavaConversions._
These implicit conversions can sometimes be a performance problem, because you're potentially converting between objects multiple times, but we don't care about that here,
because this is testing code :-).
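To make the mechanism less magical, here is a sketch with a hand-written implicit conversion. It is not how JavaConversions is implemented (that library wraps the collection rather than copying it); the names ImplicitExample, scalaSetToJavaCollection and sortedCodes are invented for illustration, and the scala.language import assumes Scala 2.10 or later.

```scala
import scala.language.implicitConversions

object ImplicitExample {
  // Hand-written conversion, standing in for the ones JavaConversions brings into scope.
  // This version copies the elements, purely to keep the example simple.
  implicit def scalaSetToJavaCollection(s: Set[String]): java.util.Collection[String] = {
    val list = new java.util.ArrayList[String]()
    s.foreach(code => list.add(code))
    list
  }

  def sortedCodes(codes: Set[String]): java.util.TreeSet[String] = {
    // A java.util.Collection is expected but a Scala Set is supplied,
    // so the compiler inserts scalaSetToJavaCollection(codes) here.
    val asJava: java.util.Collection[String] = codes
    new java.util.TreeSet[String](asJava)
  }

  def main(args: Array[String]): Unit = {
    println(sortedCodes(Set("E2", "E1")))
  }
}
```

Nothing at the call site shows that a conversion happened, which is both the convenience and the risk. Later Scala releases deprecated JavaConversions in favour of explicit asJava and asScala converters for that reason.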

Why is Scala so much more concise than Java here? One reason is type inference. In the Java method, we mention Set three times, in Scala only once.
That, the map() method and the lack of a return statement in Scala reduce a 7 line Java method to a single line. It can be on a single line, so it goes on a single line. Because we can.

Next, we'll look at how we can use static methods and how to inherit them.

2 comments:

Hubert Behaghel said...

Hi Matthew,

Thanks for this post I am really looking forward to your feedback on migrating a decent java codebase to scala.

Will your attempt strictly deal with implementation, or do you envision some architectural moves? (some ideas in which I have curiosity: layers becoming type classes, enforcing immutability, IoC by native scala constructs...)

Matthew Farwell said...

Initially, I'll look at implementation, more of a direct translation from Java to Scala. Then I'll start the fun stuff, removing hibernate and spring, immutability etc.