Random Allsorts: 2011

20 November 2011

Central exception handling in Scala

Have you ever seen this in Java code?

public class AssignResponseImpl extends ServletSupport {
  public AssignResponse assign(Ident ident, int low) {
    try {
      return getService().assign(ident, low);
    } catch (Exception e) {
      logger.debug("caught Exception", e);
      return new AssignResponse("ERROR", e.getMessage());
    }
  }
}

This sort of Java code always irritates me. I've got about 20 of these classes. They are there to glue the Axis servlets to my services.
The reason I don't like them is that you can't factor them. And because you can't factor them, it's difficult to create (and maintain) standard behaviour,
like always logging the error, and always returning the exception message in the correct place. Let's have a look at another:

public SearchResponse search(Ident ident, Criteria criteria) {
  try {
    return getService().search(ident, criteria);
  } catch (Exception e) {
    logger.debug(e);
    return new SearchResponse("ERROR", e.getMessage());
  }
}

Notice the subtle change in behaviour? No? It's in the logger. These methods are in two different unrelated services. But they are very similar, and require the
same error handling.
Now, SearchResponse and AssignResponse share a common superclass, Response. So in the case of an exception, only the class differs; the fields
we're filling in are those in Response. This doesn't really help in Java: we have to cut and paste the try/catch, with only the name of the class changing.
We also have to add the extra constructor to each subclass, and all it does is fill in the fields in the superclass. But in Scala, we can use
two tricks: manifests and first class functions.

Manifests allow you access to information about classes which you wouldn't normally have available in Java, in this case the
return type of the service method (A):

abstract class ServletSupport[A <: Response] {
  protected def exception(fn: => A)(implicit m: Manifest[A]): A = {
    try {
      fn
    } catch {
      case e: Exception => {
        logger.debug("caught Exception", e)
        // create new instance of A
        val t = m.erasure.newInstance().asInstanceOf[A]
        t.setResponseCode("ERROR")
        t.setMessage(e.getLocalizedMessage())
        t
      }
    }
  }
}

In our service endpoint superclass, we've defined exception, which takes two parameters: a by-name block (effectively a function taking no parameters and returning
A, the return type of our target method), and an implicit Manifest parameter. Importantly, the Manifest is supplied by the Scala compiler, so we don't have to pass
it manually.
It carries the extra information about A that we need in order to create a new instance.
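For anyone more at home in Java, here is a rough sketch of what the Manifest is doing for us. It assumes Java 8 or later (for the lambda), and Response, AssignResponse and ExceptionSupport are stand-ins I've made up for the illustration, not the real classes. The point is the Class<A> argument: in Java it has to be passed by hand on every call, whereas in Scala the compiler fills in the equivalent parameter.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the real Response superclass.
class Response {
    private String responseCode = "OK";
    private String message;

    public String getResponseCode() { return responseCode; }
    public void setResponseCode(String responseCode) { this.responseCode = responseCode; }
    public String getMessage() { return message; }
    public void setMessage(String message) { this.message = message; }
}

// Hypothetical subclass; it only needs a no-argument constructor.
class AssignResponse extends Response {
}

final class ExceptionSupport {
    // 'type' plays the role of the Manifest: it tells us which class to instantiate.
    // 'fn' plays the role of the by-name block passed to exception { ... }.
    static <A extends Response> A exception(Class<A> type, Supplier<A> fn) {
        try {
            return fn.get();
        } catch (Exception e) {
            try {
                // Create a new instance of A and fill in the superclass fields.
                A t = type.getDeclaredConstructor().newInstance();
                t.setResponseCode("ERROR");
                t.setMessage(e.getLocalizedMessage());
                return t;
            } catch (ReflectiveOperationException roe) {
                throw new IllegalStateException("cannot instantiate " + type.getName(), roe);
            }
        }
    }
}
```

A call then looks like ExceptionSupport.exception(AssignResponse.class, () -> service.assign(ident, low)), where service is whatever the endpoint delegates to. The class name appears at every call site, which is the repetition the Manifest removes.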

Our exception method calls the passed in function, and if no exception is thrown, it returns the value returned by the function. If there is
an exception, then the return value is a new instance of A, with the response code and message filled in. We know that we can call setResponseCode and setMessage on an A
because the class definition puts an upper bound on A: it has to extend Response. OK, so what is our calling code like now?

class AssignResponseImpl extends ServletSupport[AssignResponse] {
  def assign(ident: Ident, low: Int) = exception {
    getService().assign(ident, low)
  }
}

Now, we have standard error handling between web services; all I have to do is add an exception {} round the delegated call. In this case, we can use {} rather than (),
which means it looks like it's part of the language. And this is all type safe: I only have to specify the name of the response class once, in the class definition.

07 November 2011

How to inherit static methods in Scala

In our project we're using Apache Axis SOAP web services. With Axis, you have to define your POJOs in a certain way. Not only do you need your getters and setters, you also have to define a static method so that the Axis libraries can find the type information needed to create the WSDL, and static methods to serialize and deserialize to and from XML. These must be static methods, so they are very hard to factor out in Java. We need to duplicate them for each POJO class.

public class WebServiceObject {
    private Integer id;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    // Type metadata
    public static TypeDesc getTypeDesc() {
        TypeDesc typeDesc = new TypeDesc(WebServiceObject.class,
                            true);
        typeDesc.setXmlType(new QName("to", "WebServiceObject"));
        SoapHelper.addTypeDesc(typeDesc, "id", "int", true);
        return typeDesc;
    }

    public static Serializer getSerializer(String mechType,
                                           Class<?> javaType,
                                           QName xmlType) {
        return new BeanSerializer(javaType, xmlType, getTypeDesc());
    }

    public static Deserializer getDeserializer(String mechType,
                                               Class<?> javaType,
                                               QName xmlType) {
        return new BeanDeserializer(javaType, xmlType, getTypeDesc());
    }
}

As you can see, there is a lot of boilerplate here. Some of the POJOs are large (66 attributes), so there is a *lot* of boilerplate. Is there anything we can do about this? Let's see. The first thing we can use is @BeanProperty, to get rid of the getters and setters. This annotation, which we can apply to a field, generates a getter and setter for that field.

class WebServiceObject {
  @BeanProperty var id: Integer = _
}

So already this is a lot better. But what about the static methods? We can apply a trick here. If we define a method in a companion object, then the class gets a static forwarder for that method. So I can define getTypeDesc in the companion object, and Java can call it in the normal static way, i.e. WebServiceObject.getTypeDesc(). And we can use inheritance with companion objects. Yay!
So this is the entire code:

class WebServiceObject {
  @BeanProperty var id: Integer = _
}

object WebServiceObject extends SoapSerializer {
  val typeDesc = {
    val typeDesc = createTypeDesc(classOf[WebServiceObject],
                                  "to", "WebServiceObject")
    addTypeDesc(typeDesc, "id", "int", true)
  }
}

trait SoapSerializer {
  val typeDesc: TypeDesc

  def getTypeDesc() = typeDesc

  def getSerializer(mechType: String,
                    javaType: Class[_],
                    xmlType: QName) =
            new BeanSerializer(javaType, xmlType, typeDesc)

  def getDeserializer(mechType: String,
                      javaType: Class[_],
                      xmlType: QName) =
            new BeanDeserializer(javaType, xmlType, typeDesc)
}

So our POJO has gone from 26 Java lines to 9 Scala lines. Our 66 attribute POJO was 698 lines but is now 154.
There is more we could do, but this is enough for the minute. Our code is pretty well factored, and the amount of duplication is now acceptable. More importantly, you can read the code :-)
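As a footnote on why the methods have to be static in the first place: a library that only has a Class object in its hand can look up a static method by name and call it without an instance. The sketch below shows that mechanism with plain Java reflection. I'm assuming this is roughly what happens inside Axis; StaticLookupDemo and Bean are names I've made up, and getTypeDesc returns a String here rather than a real TypeDesc.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class StaticLookupDemo {
    // Hypothetical bean; the static method mirrors the shape of getTypeDesc().
    public static class Bean {
        public static String getTypeDesc() {
            return "type description for Bean";
        }
    }

    // Looks up a public static no-argument method by name and invokes it.
    static Object callStaticNoArg(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            if (!Modifier.isStatic(method.getModifiers())) {
                throw new IllegalArgumentException(methodName + " must be static");
            }
            // A static method needs no receiver, so the target object is null.
            return method.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The static forwarder generated for the companion object is what makes this kind of lookup succeed against a Scala class.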

25 October 2011

Reducing boilerplate in Scala

I've often heard claims from Scala advocates that coding in Scala rather than Java saves time, reduces the number of lines of code and clarifies the code, removes the boilerplate.
They say it makes the code easier to read.

So, as a long term project, I'm going to be translating one of our reasonably sized (~50k lines) projects from Java to Scala. I'll post something when I find something interesting.

I'm doing this to answer the following questions:

Can I mix and match Scala and Java? This is one of the selling points of Scala. You can use Java technology and libraries easily from Scala. My project uses (old versions of) Hibernate, Spring, Spring MVC. We'll see.

When I've finished, will I have fewer lines of code? Again, one of the selling points of Scala.

When I've finished, will my code be understandable? One of the points of contention of Scala is its perceived complexity. The old adage says: 'You can write Fortran in any language', but will we end up with a codebase which is unavoidably incomprehensible? Inherent complexity is one thing, accidental complexity is another, but will we have designed-in complexity?

Is the tooling up to the job? I mean in particular Eclipse, Maven, and some of the other Eclipse plugins. The Scala-IDE has improved a lot recently, but is it robust enough?

One thing I want to avoid is refactoring that could be done in Java. I want to compare well written Java code with well written Scala.
I'm going to translate class by class where possible, and then improve the code, make it more idiomatic.

I am reliably informed that the best place to start is my testing code, my JUnit tests. Testing code isn't delivered to the customer, so you can try something and not have it affect production code.

First things first. The old project was developed using Eclipse Galileo. This is no longer an option if I'm going to use Scala, the plugin doesn't work with it.
I'll need to upgrade to Helios.
This is essentially pain free (except for some maven issues, which I'll deal with later).

The project contains some soap services (developed using Apache Axis2). We test using a stub class and we have one test per service.
When we translate the tests directly from Java to Scala, we don't usually gain very much. For example, from a Java method such as:

private Calendar getDate(String dateString) throws Exception {
    Calendar calendar = Calendar.getInstance();
    calendar.setTime(new
               SimpleDateFormat("dd.MM.yyyy").parse(dateString));
    return calendar;
}

we end up with the following Scala method:

private def getDate(dateString: String) = {
    val calendar = Calendar.getInstance();
    calendar.setTime(new
               SimpleDateFormat("dd.MM.yyyy").parse(dateString));
    calendar
}

So the only things we've gained are the missing return type (which is inferred to be Calendar) and the missing throws Exception. Scala does not have checked exceptions, so we don't need the throws clause.
Some methods, however, condense down a lot.

private Set getErrorCode(ErrorTo[] errors) {
    Set set = new TreeSet();

    for (ErrorTo error: errors) {
        set.add(error.getCode());
    }

    return set;
}

In Scala, this becomes:

private def getErrorCode(errors: Array[ErrorTo]) =
                new TreeSet(errors.map(_.getCode).toSet)

There is actually quite a lot to see here. Scala is much more expressive when dealing with collections. The map() method applies a
function to every entry in a collection, in this case an Array, and returns a new collection of the results (here, another Array). We're applying getCode to
each entry in the array, and _ refers to the current element. So we're converting from an Array[ErrorTo] to an Array[String].
We convert this to a Set (a Scala Set) and use it to populate a java.util.TreeSet, because we wish to maintain interoperability with Java. For the minute.

We're using implicit conversions to convert between Scala and Java collections. In Scala, we can define an implicit conversion between two classes
so that if we want one of them but have the other, the compiler inserts the conversion for us. The toSet method returns a Scala Set,
but java.util.TreeSet doesn't have a constructor which accepts a Scala Set, so it has to be converted. To bring the conversions into scope, we import scala.collection.JavaConversions._:

import scala.collection.JavaConversions._

These implicit conversions can be a performance problem sometimes, because you're potentially converting between objects multiple times, but we don't care about them here,
because this is testing code :-).

Why is Scala so much more concise than Java here? One reason is the type inference. In the Java method, we mention Set three times, in Scala only once.
That, the map() function and the lack of a return statement in Scala reduce a 7 line Java method down to a single line. It can be on a single line, so it goes on a single line. Because we can.

Next, we'll look at how we can use static methods and how to inherit them.

01 October 2011

Using git svn with a large repository

I've started using the git svn bridge for one of our projects, but I had a couple of problems with the initial clone of the repository, due to the size of some files
(some > 100 MB), and to the subversion server dropping the connection.
So, I started with the standard git svn clone:

$ git svn clone https://svn.farwell.co.uk/svn/project --stdlayout
Initialized empty Git repository in c:/code/project/.git/
r1 = 339bd134b2d482cf9038c16fa75f93255ebfbc1a (refs/remotes/trunk)
W: +empty_dir: trunk/blah1
W: +empty_dir: trunk/blah2
W: +empty_dir: trunk/blah3
W: +empty_dir: trunk/blah4
....

The --stdlayout means that git expects the trunk to be called trunk, tags to be called tags and branches to be called branches.
Note also that you need to specify the url without the trunk at the end. This ran for a while, and then fell over, because svn dropped the connection on me. There is a timeout on the server.

RA layer request failed: REPORT request failed on '/svn/project
/!svn/vcc/default': REPORT of '/svn/project/!svn/vcc/default': 
Could not read chunk delimiter: Secure connection truncated (ht
tps://svn.farwell.co.uk) at C:\Program Files (x86)\Git/libexec/
git-core/git-svn line 5114

We need to load in batches. git svn fetch has a -r option which lets you specify the range of revisions to fetch (git svn clone accepts it too). We've got some large files, so we'll do 10 at a time.
I started again:

$ git svn clone https://svn.farwell.co.uk/svn/project \
                     --stdlayout -r1:2

which fetched the first two revisions, but we have to fetch the rest, about 1000 revisions. I used a quick perl script.

my $count = 1;

while ($count <= 1000) {
    # executes git svn fetch -r1:10, then -r11:20, etc. (ranges are inclusive)
    my $cmd="git svn fetch -r$count:" . ($count + 9);
    print "$cmd\n";
    system($cmd);
    $count += 10;
}

But then we get another problem: git is running out of memory; it crashed and this time it's more serious. Another problem with our big files. This is the error message:

Out of memory during "large" request for 268439552 bytes, total sbrk() is 140652544 bytes at /usr/lib/perl5/site_perl/Git.pm line 898,  line 3.

Git svn uses perl to download and process the files, but it slurps the entire file in one go. So for our large files, it runs out of memory.

After a bit of searching on the internet, I found a solution on github for our problem: Git.pm: Use stream-like writing in cat_blob().
This is a fairly simple patch, which doesn't seem to have made it into a release yet, so I applied it manually to C:\Program Files (x86)\Git\lib\perl5\site_perl\Git.pm.

@@ -896,22 +896,26 @@ sub cat_blob {
   }
   my $size = $1;
-
-  my $blob;
   my $bytesRead = 0;

   while (1) {
+    my $blob;
     my $bytesLeft = $size - $bytesRead;
     last unless $bytesLeft;

     my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
-    my $read = read($in, $blob, $bytesToRead, $bytesRead);
+    my $read = read($in, $blob, $bytesToRead);
     unless (defined($read)) {
       $self->_close_cat_blob();
       throw Error::Simple("in pipe went bad");
     }

     $bytesRead += $read;
+
+    unless (print $fh $blob) {
+      $self->_close_cat_blob();
+      throw Error::Simple("couldn't write to passed in filehandle");
+    }
   }

   # Skip past the trailing newline.

@@ -926,11 +930,6 @@ sub cat_blob {
     throw Error::Simple("didn't find newline after blob");
   }

-  unless (print $fh $blob) {
-    $self->_close_cat_blob();
-    throw Error::Simple("couldn't write to passed in filehandle");
-  }
-
   return $size;
}
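The idea behind the patch isn't specific to Perl: instead of reading the whole blob into memory and then writing it, read a small chunk, write it out straight away, and reuse the buffer. Here is the same idea as a Java sketch; ChunkedCopy is a made-up name, and the 1024 byte chunk size simply mirrors the patch.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ChunkedCopy {
    // Copies exactly 'size' bytes from in to out, holding at most 1024 bytes in memory.
    static long copy(InputStream in, OutputStream out, long size) throws IOException {
        byte[] buffer = new byte[1024];
        long bytesRead = 0;

        while (bytesRead < size) {
            int bytesToRead = (int) Math.min(buffer.length, size - bytesRead);
            int read = in.read(buffer, 0, bytesToRead);
            if (read < 0) {
                // The input ended before we got the announced number of bytes.
                throw new EOFException("input ended after " + bytesRead + " of " + size + " bytes");
            }
            // Write each chunk as soon as it arrives instead of accumulating it.
            out.write(buffer, 0, read);
            bytesRead += read;
        }
        return bytesRead;
    }
}
```

Memory use stays constant however large the file is, which is what we need for our 100 MB files.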

I restarted the process from the beginning and voilà, it got to the end. All of the revisions had been fetched, all that was left to do was a

$ git svn rebase

to merge the changes into the tree and have a working git repo.

If I had wanted to migrate from svn to github, rather than continue to use git svn, I'd have done exactly the same thing, but added a --no-metadata to the clone command.
And obviously you don't need to do an svn rebase, just a rebase.

12 April 2011

Playing with Restructure 101 - Java

A couple of months ago, I went to the JUGL (Java User Group Lausanne), at which they had a comparison between various packages for analyzing quality, including Sonar, Coverity, Jtest, Xdepend and Restructure 101.

I decided to play with some of them and really try them out. So we're starting with Restructure 101.

Please note that I am in no way affiliated with Headway Software.

Restructure 101/Structure 101 is a package which analyzes your code base in terms of complexity, along two axes, which they call FAT and Tangle. A package/jar/whatever is fat if there are lots of classes in it. Your code is tangled if there are loops in your dependency graph, or if there are too many dependencies between packages. For instance, if package a depends on package b which in turn depends on a, you've got a loop. This is considered bad.
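To make the idea of a loop concrete, here is a small Java sketch that checks a package dependency graph for cycles with a depth-first search. It only covers the loop part of a tangle, not the "too many dependencies" part, and it is my own illustration, not how Restructure 101 works internally. TangleCheck and the map layout (package name to the packages it depends on) are assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TangleCheck {
    // Returns true if the dependency graph contains at least one loop.
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String pkg : deps.keySet()) {
            if (visit(pkg, deps, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, List<String>> deps,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(pkg)) {
            return true;  // we came back to a package we are still exploring: a loop
        }
        if (!visited.add(pkg)) {
            return false; // already fully explored, no loop through here
        }
        onPath.add(pkg);
        for (String dep : deps.getOrDefault(pkg, Collections.<String>emptyList())) {
            if (visit(dep, deps, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(pkg);
        return false;
    }
}
```

With a depending on b and b depending on a, hasCycle returns true; remove either dependency and it returns false.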

So, as a test I'm using a single jar from one of our projects. (Java 1.4, Hibernate 2). Even before I start this, I know it's badly structured and hard to follow, but what's the best way to improve this?

In Restructure 101, you import the jar into a new project and it shows you:

So we can see in the top left, we've got a fairly tangled codebase, but not too fat. (You can adjust the parameters).

By double clicking on a package, you expand it, and you start to see the links between packages. So we drill down a bit, looking for tangles.

So we look for stuff with a red background. The blue lines indicate the dependencies which cause tangles. After a minute or so, we have the following:

We seem to have found it. So, what does this diagram mean? In fact, we have a package bo, and a sub-package bo.impl, and a corresponding modele and modele.impl. The bo contains the base interfaces and abstract classes and modele the model interfaces and concrete classes. So what can we do about this? Restructure 101 allows us to drag and drop from one package to another. So we'll take the interfaces from bo and put them into modele and the abstract classes from bo.impl to modele.impl.

The result looks a lot better, and I think it is a lot easier to understand. All of my model is in one place now. It's a pretty big package (56 classes), but acceptable because all the classes are the same type of object. When I'm in this package, I don't need to think. This improved clarity of architecture is reflected in the complexity indicator:

The indicator has moved in the right direction (down). We'll stop there.

This is all very nice, but what can we do with this information? Restructure 101 gives us a number of options. The first and probably most useful is to export the list of actions made in Restructure 101 into the Eclipse plugin. I can't do this because I've only got an evaluation license, so I can't tell you whether or not it works :-). But even without the export, I can go to Eclipse and know what to do.

So, for my test jar, I found Restructure 101 useful, without even going very deep into the functionality. You can do a lot more, including publishing to a repository, for changes to be picked up by the eclipse plugin, and there is a sonar plugin.

Next, I wonder how this will react to a Scala project.

27 February 2011

Replacing properties with a groovy script

I've used groovy in an interesting way recently.

We were writing a Spring/JSF application for a client which managed a set of accounts for the local council. This would send invoices to another system, using a proprietary file format. Part of the file sent contained the text that would be printed out and sent to the client (of the council), including text which could be changed by the council, such as email addresses, phone numbers.

So we had two problems:

1) How do we nicely provide a way for the council to change the names, email addresses and phone numbers on the final invoice? We could have 20 different properties defined, but this is a nightmare.
2) We're talking about formatting invoices, and during testing we would have a lot of to-ing and fro-ing with the client. How do we minimize this?

Our solution: use a groovy script which is invoked from Java. The script is delivered in the same directory as the other properties (so the client can edit it). We can also modify it and redeploy it rapidly when the client changes their mind, without having to redeploy the entire application.

In the server, we have a Java method which invokes the groovy script. To keep this simple, the script takes as input a variable called invoice (an object) and defines a variable called output, a String, which represents the text that will end up on the invoice sent to the user. Here is the Java which calls the script:


package uk.co.farwell;

import groovy.lang.Binding;
import groovy.util.GroovyScriptEngine;

import org.codehaus.groovy.control.CompilerConfiguration;

public final class RunGroovyTemplate {
      public String runTemplate(String root, String script, String encoding, Invoice invoice) throws Exception {
            String[] roots = new String[] { root };
            GroovyScriptEngine gse = new GroovyScriptEngine(roots);
            Binding binding = new Binding();
            binding.setVariable("invoice", invoice);

            CompilerConfiguration compilerConfiguration = new CompilerConfiguration();
            compilerConfiguration.setSourceEncoding(encoding);
            gse.setConfig(compilerConfiguration);
            gse.run(script, binding);

            return binding.getVariable("output").toString();
      }
}

Nothing very complex there. And this method is called with something like:


String output = new RunGroovyTemplate().runTemplate("C:\\temp", "template.groovy", "UTF-8", invoice);

System.out.println("output=\n\n" + output);

Again, nothing complex. Ok, so what does the groovy look like?


output = "Invoice number: ${invoice.id}\n\n"

for (line in invoice.lines) {
     output = output + " ${line.description}  ${line.total}\n"
}

output = output + "-------------------\n";
output = output + "total         ${invoice.total}\n";

output = output + """
--
Please inform us of any changes :
billing address or name(s)

Contact details:

For questions about this invoice:
Fred Bloggs:  021 454 67 78
E-mail: fred.bloggs@foo.com
"""

So you can see that we've not got a lot of Java, but we've built in a lot of flexibility by using groovy. I've carefully separated out the contact details so that they can be changed easily on site.

The output from this looks like:


Invoice number: 66

 description  34.00
-------------------
total         34.00

--
Please inform us of any changes :
billing address or name(s)

Contact details:

For questions about this invoice:
Fred Bloggs:  021 454 67 78
E-mail: fred.bloggs@foo.com

The disadvantage of this approach is that we are delivering a script to a client, which isn't always a good thing to do, but we thought that the advantages outweighed the disadvantages.

And don't forget that this can be unit tested just as easily as a pure Java solution.

21 February 2011

Backup, backup backup

I was having problems with my Dell Precision M4500. It wasn't booting:

Windows failed to start. A recent hardware or software change might be the cause. To fix the problem:

1. Insert your windows installation disc and restart your computer.
2. Choose your language settings, and then click "Next".
3. Click "Repair your computer"

If you do not have this disc, contact your administrator or computer manufacturer for assistance.

File: \Status: 0xc000000f

Info: An error occurred while attempting to read the boot configuration data.

After panicking that my hard disk was broken, I discovered that this was a problem with the windows boot manager.

So, I made a Windows recovery disk from another Windows 7 machine I had, and tried to boot from that. Didn't work. Same problem. Ah, OK, we'll switch off booting from the hard disk, just boot from the CD. F2 to enter setup on the Dell Precision, sorted.

It took an awful long time to boot from the CD, it's always worrying when that happens. But it boots. The CD says that it's a problem with the boot manager, do you want me to fix it. I say yes. And it does work. It reboots successfully.

First thing to do, install Crashplan and do a backup. Just in case.

Then I switched the boot from disk back on, and rebooted. It works. Woohoo.

Moral of the story (two actually):

Make a windows recovery disk, before you need it.
Backup, and test your backups. I've started to use Crashplan (http://www.crashplan.com/). It's great.