Friday, 20 June 2008

Class::Std or Blessed Hash

Objects, Objects, Objects

Everything is an object these days, or at least it is in the world of agile, well-structured, extensible, easy-to-maintain bioinformatics software.

Even Perl 6 is aiming to be OO, probably because so many of the modules on CPAN at least expose an OO layer, if they are not OO only.

When I started programming Perl, I wrote straightforward top-to-bottom scripts.

I then moved on to using and writing modules, but only exported their subroutines into the calling script, for simple code reuse.

Last summer, I was finally taught, through hands-on development, exactly how OO works and how it is used. I got a bit confused, but at least I had none of the confusion of

$him = Person->new({args});
$her = $him->new({args});

Which implies a relationship which 'is not there'.

Last summer I discovered Class::Std, which I think is probably my favourite CPAN module of all time. Why?

Well, this is the thing. Perl is not an OO language, and it isn't slower because of it. I also learnt Ruby on Rails (as I mentioned in a previous post) and Ruby is slower because everything is an object. That is something that clearly sets the two languages apart.

Now, that isn't the thing that bugs me about OO. In fact, I have learned to embrace Perl OO, and enjoy programming in it. What does bug me is that the vast majority of Perl OO breaks encapsulation, because most objects are nothing more than hashes. You have a new constructor, which blesses a hash reference into the package. So, when all is said and done, whilst good packages have accessors written to expose the data stored within the object via method calls, you can just access a lot of it via a key.

$him->eye_colour() is equivalent to $him->{eye_colour}
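As a minimal sketch of that equivalence, assuming a hand-written Person class (the constructor and accessor bodies here are illustrative, not taken from any real module):

```perl
use strict;
use warnings;

package Person;

# Conventional constructor: bless a hash reference into the class.
sub new {
    my ($class, $args) = @_;
    my $self = { eye_colour => $args->{eye_colour} };
    return bless $self, $class;
}

# Accessor: the intended public interface to the stored value.
sub eye_colour {
    my ($self) = @_;
    return $self->{eye_colour};
}

package main;

my $him = Person->new({ eye_colour => 'blue' });

# Both forms reach the same stored value; nothing stops the second one.
my $via_method = $him->eye_colour();
my $via_key    = $him->{eye_colour};
print "method: $via_method, key: $via_key\n";

# Nothing stops a caller from inventing a brand new key either.
$him->{address} = '1 Example Street';
my @keys = sort keys %{$him};
print "keys now: @keys\n";
```

The object ends up with an address key that the Person package never declared, which is exactly the back door described here.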

and this encourages lazy programming, because the other temptation is that you can just say 'I need to store some data, but Person doesn't have an address accessor, so where should I put it?'

Now, presumably Person does have something that links it to Address. Perhaps Address and Person both have an id_person accessor. But you can cheat. If you want to grab the address now, and cache it for later, just do

$person->{address} = $address->house_and_street();

Then you can drop the address object, and person now knows exactly where they live.

However, this is dangerous, because

1) Have you overwritten something already stored under the key address?
2) What if they move whilst the person object is still in memory? You now have two places to correct the data.

'Why?' I hear you cry. 'I won't do that in my program.' No, but someone else will (or you will forget).

Solution: use Class::Std;
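The second danger can be sketched as follows. Both classes are hypothetical stand-ins written for this illustration, and the combined getter/setter behaviour of house_and_street is an assumption:

```perl
use strict;
use warnings;

package Address;

sub new {
    my ($class, $args) = @_;
    my $self = { %{$args} };
    return bless $self, $class;
}

# Combined getter/setter (assumed for this sketch).
sub house_and_street {
    my ($self, $new_value) = @_;
    $self->{house_and_street} = $new_value if defined $new_value;
    return $self->{house_and_street};
}

package Person;

sub new {
    my ($class, $args) = @_;
    my $self = { %{$args} };
    return bless $self, $class;
}

package main;

my $address = Address->new({ house_and_street => '1 Old Road' });
my $person  = Person->new({ eye_colour => 'blue' });

# The cheat: cache the address directly on the person's hash.
$person->{address} = $address->house_and_street();

# They move house, and only the Address object is told about it.
$address->house_and_street('2 New Street');

my $cached = $person->{address};
my $actual = $address->house_and_street();
print "cached: $cached\n";
print "actual: $actual\n";
```

The cached copy still holds the old street, so the two objects now disagree and both would need correcting.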

Class::Std enforces encapsulation. You still get a blessed reference, but this time it is a reference to a SCALAR, which can't have keys.

You then declare in each package which accessors you want the object to have, and so force people to use only those accessors. You don't have to worry about AUTOLOAD in the modules you use, as Class::Std handles creating your accessors. You don't even need to write a new constructor, although you can add a BUILD method which will run at construction.
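To see why a blessed scalar closes the back door, here is a hand-rolled sketch of the same idea using only core modules. It is not Class::Std's actual implementation; the storage hash, constructor and accessor are written out by hand purely for illustration:

```perl
use strict;
use warnings;

package Person;
use Scalar::Util qw(refaddr);

# Attribute data lives in a lexical hash inside the package,
# keyed by each object's memory address, not inside the object.
my %eye_colour_of;

sub new {
    my ($class, $args) = @_;
    # The object itself is just a blessed anonymous scalar.
    my $self = bless \do { my $anon_scalar }, $class;
    $eye_colour_of{ refaddr($self) } = $args->{eye_colour};
    return $self;
}

sub get_eye_colour {
    my ($self) = @_;
    return $eye_colour_of{ refaddr($self) };
}

# Clean up the stored attribute when the object goes away.
sub DESTROY {
    my ($self) = @_;
    delete $eye_colour_of{ refaddr($self) };
    return;
}

package main;

my $him    = Person->new({ eye_colour => 'blue' });
my $colour = $him->get_eye_colour();
print "colour: $colour\n";

# Treating the object as a hash now fails at run time.
my $ok    = eval { $him->{address} = '1 Example Street'; 1 };
my $error = $ok ? '' : $@;
print "cheating failed: $error";
```

Because the data is only reachable through %eye_colour_of, which is private to the package, callers have to go through whatever methods the package chooses to offer.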

So in my example

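package Person;

use Class::Std;

# The attribute data lives in this lexical hash, not in the object itself.
# :get<eye_colour> and :set<eye_colour> generate get_eye_colour() and set_eye_colour().
my %eye_colour_of :ATTR( :init_arg<eye_colour> :get<eye_colour> :set<eye_colour> );

package main;

my $him = Person->new({eye_colour => 'blue'});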

Job done. Less code for initial construction than blessing via new, and you cannot be tempted to throw the address onto the person when it is being used, as

print $him;    # prints something like Person=SCALAR(0x9f2c68)

So, unless you specify in the code (documented and tested, of course) that you want an accessor which allows this object to store the address, it can't be done, and your later code is more robust for it.

Now, where am I going with all of this?

Well, I use Clearpress to form a base for my Perl web apps in my current role. It is a good solid platform which I have mentioned before, and I am very happy to work within it. However, I am writing an API to use the services it provides. Clearpress doesn't use Class::Std. My API does. This is no problem as they talk via LWP::UserAgent requests, but it is quite confusing as they live in the same project in Subversion. And my big problem is that I am programming both at the same time. This is bad news, as I have been trying to use features of one type of object with the other. It hasn't really made a significant difference, as the package name reminds me which I should be using, but it is weird getting the error when you try to cheat and use a key to cache some info in the Class::Std object, as it is a scalar.

So, from this, I am going to finish the project in the way I have started it, but I think from now on there is one golden rule:

Use only one type of object, and just ensure you enforce encapsulation by the way you program - don't get lazy.

Now, to convince my boss to refactor Clearpress into Class::Std...


RobHu said...

Ruby isn't slower because everything is an object. Ruby (1.8) is slower because it has a really crappy interpreter (MRI). Python has a much more mature runtime than Ruby 1.8.

The Ruby 1.9 interpreter / virtual machine is on par or faster than the Python runtime.

As you'll know from recent FooMongers posts, Ruby can be a lot faster. MagLev looks to be many times faster than Ruby 1.8 (6x to 40x they claim...). MagLev will be proprietary, but in the next few years I would expect Rubinius to become the fastest Ruby implementation (it has a similar design to MagLev, but is written from the ground up by amateurs rather than being a port of a 20 year mature Smalltalk VM).
