Saturday 12 December 2009

Speedy does it.

To make the pipeline more flexible, I developed MooseX::AttributeCloner (see CPAN) so that variables set on the original command line can be passed around.

However, this has led to the need to redevelop some further loading scripts that had originally been written with Class::Std and Class::Accessor. Switching them to Moose has had a two-fold effect that I have been rather happy with.

1) Less code. I had written the Roles for the pipeline, but made sure I left them as potentially 'common code', so consuming them in these scripts ditched a substantial amount of code. Woohoo!

2) Faster. The loading of the data runs exceptionally fast now. Since the code was refactored into consumed roles, there is no longer any need to generate extra objects, and this has increased the speed of data lookups. I haven't truly benchmarked the loading of the data, but something that was taking a couple of minutes to run through completely now takes mere seconds. In fact, the logs don't show any change in the timestamp used in each of the print statements.
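To show what consuming a role instead of generating an extra object looks like, here is a minimal sketch. The package names, the run_folder attribute and the archive_path method are all made up for illustration and are not the real pipeline code. It only assumes Moose is installed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical role holding the 'common code' (illustrative names only).
package My::Role::RunPaths;
use Moose::Role;

has 'run_folder' => (is => 'ro', isa => 'Str', required => 1);

# Any class that consumes the role gets this method directly.
sub archive_path {
    my ($self) = @_;
    return $self->run_folder . '/archive';
}

# Hypothetical loading script class.
package My::Loader;
use Moose;

# Consuming the role composes its attribute and method into this class,
# so no separate helper object has to be built and queried.
with 'My::Role::RunPaths';

sub describe {
    my ($self) = @_;
    return 'loading from ' . $self->archive_path;
}

package main;

my $loader = My::Loader->new(run_folder => '/data/run_1234');
print $loader->describe, "\n";    # prints "loading from /data/run_1234/archive"
```

The loader calls archive_path on itself, where the Class::Std style would typically have meant constructing a second object just to ask it for the path.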

The longer of the two refactored scripts lost nearly half of its code, the shorter about 45%. The slow parts of the long script (conversion to XML and database loading) still need some refactoring of how it loads (dropping the XML mid-stage), but some improvements have certainly been achieved with the switch to Moose.

Another factor that has helped speed this up is that, if the pipeline has already told the scripts lots of information (such as file system paths, filenames, etc.), the code doesn't need to work it out again. A definite advantage of combining MooseX::Getopt and MooseX::AttributeCloner.
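One way to get that "don't work it out again" behaviour in plain Moose is a lazy attribute with a builder: if a value is supplied, the builder never runs. The sketch below is only that, a sketch. My::PathLoader, qc_path and the call counter are hypothetical names, and the values are passed straight to new() to keep it self-contained, where a real script would more likely receive them from the command line through MooseX::Getopt's new_with_options.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical loader class (illustrative names only).
package My::PathLoader;
use Moose;

# Counter used only to show when the builder actually runs.
our $builder_calls = 0;

has 'run_folder' => (is => 'ro', isa => 'Str', required => 1);

# Lazy attribute: the builder is called on first access,
# and only if no value was supplied to the constructor.
has 'qc_path' => (
    is      => 'ro',
    isa     => 'Str',
    lazy    => 1,
    builder => '_build_qc_path',
);

sub _build_qc_path {
    my ($self) = @_;
    $builder_calls++;
    # Stand-in for the expensive "work it out" step.
    return $self->run_folder . '/qc';
}

package main;

# The pipeline already knows the path, so it is simply passed in.
my $told = My::PathLoader->new(
    run_folder => '/data/run_1234',
    qc_path    => '/staging/qc',
);
print $told->qc_path, "\n";      # prints "/staging/qc", builder not called

# Nothing supplied, so the builder works it out on first access.
my $untold = My::PathLoader->new(run_folder => '/data/run_1234');
print $untold->qc_path, "\n";    # prints "/data/run_1234/qc"

print "builder calls: $My::PathLoader::builder_calls\n";    # prints "builder calls: 1"
```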
