Sunday 11 October 2009

Is it easier to be specific?

I'm re-assessing some code, trying to refactor it into a re-usable role. Simple, you might think? But is it?

It is very easy to write your code to perform a task on a specific item, or in a specific way, from variable names which mean something tangible to methods designed to act on a pathway which is unique to your production setup. But how do you make it more reusable?

1 - Variable/Method names

I was taught to make my variable/method names mean something. This makes the code more readable.

my $donut = q{jam donut};
...
eat($donut);

instead of

my $d = q{jam donut};
...
eat($d);

This is a trivial example, but the principle is there.

However, you (read I) can take it too far. One such point is in directories. For the analysis pipeline, we end up with a directory, after a step called GERALD, called GERALD-date.

In the code, we put this into $gerald_dir. Sounds reasonable. Everywhere I read $gerald_dir, I know exactly what it represents.

However, here is the problem. What happens when the step and directory are renamed Harold? Whilst the principle is the same, and the same files are there, suddenly the variable name is wrong. Just grepping the filesystem won't find something like Gerald. At this point you are probably screaming at me: give it a semantic name, and document what it represents.

Exactly, but that is easy with internal local variable names, not so with publicly exposed method names. Suddenly I need the role to include a deprecation cycle for the replacement method names, aaahhh!
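A deprecation cycle for a renamed method might look something like the sketch below. This is only a minimal illustration in plain Perl, not the actual role: the package name Pipeline::Paths and the replacement name analysis_dir are made up for the example, and the old gerald_dir method is kept as a thin wrapper which warns and then delegates to the new name.

```perl
package Pipeline::Paths;    # hypothetical package name, for illustration only
use strict;
use warnings;
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    return bless { analysis_dir => $args{analysis_dir} }, $class;
}

# Semantic name (an assumption for this sketch): the directory produced
# by the analysis step, whatever that step happens to be called.
sub analysis_dir {
    my ($self) = @_;
    return $self->{analysis_dir};
}

# Old, step-specific name, kept for one deprecation cycle.
# It warns the caller and then delegates to the new method.
sub gerald_dir {
    my ($self) = @_;
    carp q{gerald_dir is deprecated, use analysis_dir instead};
    return $self->analysis_dir;
}

package main;

my $paths = Pipeline::Paths->new(analysis_dir => q{GERALD-date});
print $paths->analysis_dir, "\n";
```

Callers still using gerald_dir keep working, but get a warning pointing at their own line of code (that is what carp is for), so they have a release or two to move over before the old name is removed.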

2 - Application/Locally specific

Anyone would argue that locally specific logic should exist as far up as possible, keeping it out of the generic process-end logic as much as possible. No arguments there.

But how do you determine which is app specific, and which is generic?

Obviously, naming conventions are app specific. Or are they? Many things might need to know how to construct a filename in a particular way.

Ok, then how about directory structure? Again, you may find many apps wanting to access the Recalibrated data dir. They all need to know how to get there.

This is, as you can see, quite a grey area. One where, as I am finding, I think the refactor into a generic role still needs to be a little more specific than might be first thought. You can't get to a directory without at least some knowledge of where it is likely to be. You can't open a file without at least some knowledge of how it is named.
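One way to keep that "little more specific" knowledge in the generic code, while still letting the application have the final say, is to hold it as documented defaults which can be overridden from the top. The sketch below is only an illustration under assumptions: the package name Pipeline::Layout, the default directory names, and the idea that the recalibrated data sits in a subdirectory of the analysis directory are all invented for the example.

```perl
package Pipeline::Layout;    # hypothetical package name, for illustration only
use strict;
use warnings;
use File::Spec;

# Documented assumption (for this sketch): the recalibrated data lives in
# a subdirectory of the analysis directory. Both names are only defaults,
# which the calling application may override.
my %DEFAULT_NAME = (
    analysis     => q{analysis},
    recalibrated => q{recalibrated},
);

sub new {
    my ($class, %args) = @_;
    # Application-supplied names win over the generic defaults.
    my %name = (%DEFAULT_NAME, %{ $args{names} || {} });
    return bless { base => $args{base}, name => \%name }, $class;
}

sub analysis_dir {
    my ($self) = @_;
    return File::Spec->catdir($self->{base}, $self->{name}{analysis});
}

sub recalibrated_dir {
    my ($self) = @_;
    return File::Spec->catdir($self->analysis_dir, $self->{name}{recalibrated});
}

package main;

# The application-specific knowledge is supplied here, at the top level.
my $layout = Pipeline::Layout->new(
    base  => q{run_folder},
    names => { analysis => q{GERALD-date} },
);
print $layout->recalibrated_dir, "\n";
```

The generic code still knows roughly where things are likely to be, but the step-specific name only appears in the application, so a rename to Harold becomes a one-line change there.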

Determine a row in a database? You need some way of constructing a query to get it...

Paul Weller said "No one ever said it was gonna be easy", and I wouldn't want it any other way. But remember, if you want your code to be reusable, try to make names semantic, but not specific, and document what they represent.

Also, ensure you document why you process with particular assumptions. At least then, no-one can say you didn't tell them.

That is my plan at least (where I can) from now on.
