Tuesday 17 November 2009

Refactoring into Roles

So, the pipeline moves along, and then the bossman says:

'I want to be able to run it outside of the normal pipeline locations - how do I do it?'

After a discussion as to why he wanted to do this (since the person normally responsible for testing out new Illumina pipelines usually does it in the usual place, moving softlinks as necessary), it turned out that he wanted to run some stuff and dump the output into his home directory.

OK, if that is what you want.

'Oh yes, also, can we make it more generic? A lot of the things we have done assume the setup we have here (most of which is generated by the Illumina pipeline), and I would like it to move away from that.'

This is a major refactor, then: separate business logic from generic logic, and allow everything to be user-defined, should the user want that.

Looking at it now, it seems an obvious thing really, but at the time, I have to say, I panicked a bit. The pipeline had been written to be pluggable, so new 'components' can be switched in or out quickly, but I hadn't really planned for it to be used outside of the current setup. The principle is easy enough to apply elsewhere, and should not take long to set up, but the components were specialised to apply to what they represent.

So, the first thing I did was go and get a drink. I don't drink tea or coffee, but popping off for a break seemed like the right thing to do. This break lasted a while, as I thought about which strategy to take. Should I go for all-new objects, passing them around? How about pushing them from the command line? MooseX::Getopt seemed to give an option to put any attributes onto the command line, but what about subsequent component objects from the pipeline?

In the end, I wondered about using Moose Roles. The Manual seemed to suggest that this could be a good way forward. A 'class' that doesn't need to be instantiated or extended, but instead is consumed to become part of the class that you want. So, everything could have all the same attributes, in exactly the same way.
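To make the idea concrete, here is a minimal sketch of two classes consuming the same role. The package names, the `run_folder` attribute and its default are all made up for illustration; they are not taken from the real pipeline.

```perl
use strict;
use warnings;

# Hypothetical role: a bundle of attributes and methods to be consumed.
package My::Role::RunFolder;
use Moose::Role;

has 'run_folder' => (
    is      => 'ro',
    isa     => 'Str',
    default => '/tmp/run',    # illustrative default only
);

# Roles can carry methods as well as attributes.
sub describe_run_folder {
    my ($self) = @_;
    return 'run folder: ' . $self->run_folder;
}

# Two unrelated (hypothetical) classes consume the role with 'with'.
# Neither inherits from the other, or from the role.
package My::Component::Archiver;
use Moose;
with 'My::Role::RunFolder';

package My::Component::Qc;
use Moose;
with 'My::Role::RunFolder';

package main;

my $archiver = My::Component::Archiver->new( run_folder => '/home/boss/run_1' );
my $qc       = My::Component::Qc->new;

print $archiver->describe_run_folder, "\n";
print $qc->describe_run_folder,       "\n";
```

Both objects end up with the same attribute, declared in exactly one place, which is the property the rest of the refactor leans on.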

Now, I have to say, I could see an immediate danger here. If everything can be given to everything, then what distinguishes a component from the pipeline 'flag waver'? So, I needed to be sensible: each pipeline needs to match the components that are going to be launched from it, but not the others.

First job: sort out the pipelines. This was quite simple, and needed doing anyway. The flag waver had a component launcher for every possible component, but most components are only used by one of the three pipelines, so I separated the launchers out into three subclasses, one per pipeline. This had the added advantage of giving the three pipeline flag wavers names as well.

OK, so what's the next thing?

Go through all the components looking for common attributes and methods, and label them as generic or specific. Once I had done this, I created some roles to refactor them into. Sometimes I left them in the class, if there was no need for them to be available in multiple classes. Otherwise I put them in a role named to say whether it held generic or business logic, and what type of feature it provides.

After that, I just needed to apply the roles to the classes, so that the flag waver had the same attributes as any component class it would launch. This enables me to hold the attribute value in the flag waver, and pass it to anything launched from it.

This left a headache, in that each new instantiation would need me to loop through all the attributes and pass them through. So, I put together another role to do this (see http://vampiresoftware.blogspot.com/2009/11/moosexattributecloner.html for details on this).
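The looping itself can be sketched with plain Moose introspection. This is a hand-rolled illustration and not the code of the role linked above; the package names, the attributes and the `launch` method are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical generic role shared by the flag waver and its components.
package My::Role::Generic::Paths;
use Moose::Role;

has 'output_dir' => ( is => 'ro', isa => 'Str' );
has 'verbose'    => ( is => 'ro', isa => 'Bool' );

package My::Component::Archiver;
use Moose;
with 'My::Role::Generic::Paths';

package My::Pipeline::FlagWaver;
use Moose;
with 'My::Role::Generic::Paths';

# An attribute only the flag waver has; it must not be passed on.
has 'pipeline_name' => ( is => 'ro', isa => 'Str', default => 'analysis' );

# Build an object of $class, handing over every attribute value this
# object holds that $class also declares. Assumes both classes use the
# same init_arg for a shared attribute, which holds when both get it
# from the same role.
sub launch {
    my ( $self, $class ) = @_;
    my %args;
    for my $attr ( $self->meta->get_all_attributes ) {
        next if !$attr->has_value($self);          # never set, nothing to pass
        my $init_arg = $attr->init_arg;
        next if !defined $init_arg;                # not settable via new()
        next if !$class->meta->find_attribute_by_name( $attr->name );
        $args{$init_arg} = $attr->get_value($self);
    }
    return $class->new(%args);
}

package main;

my $flag_waver = My::Pipeline::FlagWaver->new( output_dir => '/home/boss/out' );
my $archiver   = $flag_waver->launch('My::Component::Archiver');

print $archiver->output_dir, "\n";
```

Because the attributes come from a shared role, the flag waver never needs a hard-coded list of what to hand on; adding an attribute to the role is enough.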

So, now I'm left with the final problem: how to allow the user to run a pipeline with the options they want. A quick solution to this was already being used by another of my team: MooseX::Getopt. This is a great role to apply to a class, which then enables any attribute to instantly become a command line option. Since the roles created above give attributes to a class, those attributes become command line options. Hurrah, problem solved.
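As a rough sketch of how that fits together (again with hypothetical package and attribute names, and a made-up script name in the comment), a class that consumes both MooseX::Getopt and one of the attribute roles can be built straight from `@ARGV` with `new_with_options`:

```perl
use strict;
use warnings;

# Hypothetical generic role providing a user-definable attribute.
package My::Role::Generic::Paths;
use Moose::Role;

has 'output_dir' => (
    is      => 'ro',
    isa     => 'Str',
    default => '.',    # illustrative default only
);

package My::Pipeline::FlagWaver;
use Moose;
with 'MooseX::Getopt';              # adds new_with_options
with 'My::Role::Generic::Paths';    # its attributes become options

package main;

# e.g. perl run_pipeline.pl --output_dir /home/boss/out
my $flag_waver = My::Pipeline::FlagWaver->new_with_options();

print 'writing to ', $flag_waver->output_dir, "\n";
```

Run without arguments, the attribute falls back to its default; run with `--output_dir`, the value from the command line is used instead.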

So, after initially thinking that Roles would not really be worth much, considering how much subclassing I have normally done and am used to, I'm now convinced that this is a very useful technology.

Result of all this refactoring: over 900 lines of code lost. The value according to sloccount is $35,000 less. The code is now more maintainable, and more useful for users who want to define their own parameters.

Sorry Class::Std, you have just had another nail forcefully bashed into your coffin.
