Thursday, 4 March 2010

crontab or daemon

So here is an interesting choice I need to make.

I have just rewritten a bit of code to email interested parties when a run with their data on it reaches two points: once when it reaches run complete (i.e. the instrument has done all its processing), and again when the data has been post-processed, QC'd and deposited in the central archive space for them to obtain.

I'm quite pleased with the code. It is more robust than the previous hack, which we had never intended to be all-encompassing, and it actually mails the parties that should be interested (rather than some 'user' who may or may not be the right person).

It is, of course, also written using Moose.

However, now I have to decide which to choose: a cronjob or a daemon process.

Cronjob:

Pros - very quick. Just decide how often to launch it, and run the script.
Cons - need to remember which node the cronjob is running on, need to do something with the outputs (logs, etc.), need to ensure that jobs don't relaunch on top of each other
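
One way to deal with the "relaunch on top of each other" problem is a lock file taken at the top of the script. This is only a rough sketch using core Perl (Fcntl and flock), not the actual notifier code: the lock path, the NOTIFIER_LOCK variable, the acquire_lock helper and the crontab line in the comment are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical crontab entry (paths are assumptions), appending output to a log:
#   */5 * * * * /path/to/run_notifier.pl >> /path/to/run_notifier.log 2>&1

# Hypothetical lock file location; override with NOTIFIER_LOCK if needed.
my $lock_path = $ENV{NOTIFIER_LOCK} || '/tmp/run_notifier.lock';

# Returns a locked filehandle, or undef if another instance holds the lock.
sub acquire_lock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "Cannot open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

my $lock = acquire_lock($lock_path);
unless ($lock) {
    print "Previous run still in progress, exiting\n";
    exit 0;
}

print "Lock acquired, sending notifications\n";
# ... the real notification work would go here ...
# The lock is released when the filehandle is closed or the process exits.
```

Because the lock belongs to the open filehandle, a crashed run does not leave a stale lock behind, which is the usual worry with hand-rolled PID files.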

Daemon:

Pros - Use a monitor to keep us informed it is still running, cyclical so runs won't launch over each other, writing to a log file is easy
Cons - Need to write a daemon controller script
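
For comparison, the "cyclical" part of a daemon can be sketched as a plain loop that does one pass, sleeps, and stops cleanly on a signal. Again this is just an illustration in core Perl under my own assumptions: run_loop, log_line and the max_cycles argument are invented names, the loop stays in the foreground (no fork or detach), and a real controller script would still be needed around it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Flag cleared by TERM/INT so the loop can finish its current cycle and stop.
my $keep_running = 1;
$SIG{TERM} = $SIG{INT} = sub { $keep_running = 0 };

# Write a timestamped line to the given filehandle.
sub log_line {
    my ($fh, $msg) = @_;
    printf {$fh} "[%s] %s\n", strftime('%Y-%m-%d %H:%M:%S', localtime), $msg;
}

# Runs $args{work} once per cycle, sleeping $args{interval} seconds in between.
# max_cycles exists only for demonstration; leave it undef to loop until signalled.
sub run_loop {
    my (%args) = @_;
    my $cycles = 0;
    while ($keep_running) {
        # A failing cycle is logged rather than killing the whole process.
        eval { $args{work}->(); 1 } or log_line($args{log}, "cycle failed: $@");
        $cycles++;
        last if defined $args{max_cycles} && $cycles >= $args{max_cycles};
        sleep $args{interval} if $keep_running;
    }
    return $cycles;
}

my $done = run_loop(
    work       => sub { log_line(\*STDOUT, 'checking for runs to notify') },
    log        => \*STDOUT,
    interval   => 1,
    max_cycles => 2,
);
log_line(\*STDOUT, "stopped after $done cycles");
```

Since each cycle only starts after the previous one has finished, overlapping runs cannot happen by construction, which is the main attraction over cron.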

I'm sure that there are others; I'm mostly babbling and writing this down as I think. Certainly, for the first release of this, I will start it as a cronjob, but down the line I think I will move it to a daemon, once the script has been in a production environment for a while (i.e. we know it is working correctly!).

4 comments:

NPEREZ said...

Regarding your negatives on the cronjob: if you are using Moose, and you are doing anything moderately complex (or plan to down the road), the startup time will definitely become an issue. At work we switched to a daemon process because the cron jobs were overlapping (and things ran in serial, which meant a long-running job clogged the pipe).

That said, it pushed me to develop and release POEx::WorkerPool which nicely solved the problem of job processing, and I threw together a daemon using POE that handled signals, etc.

Steven Haryanto said...

To tell you the truth, I hate writing daemons. It's a bigger responsibility: harder to get right, and you need to take care of issues like concurrency, memory leaks, security, client timeouts, process management, etc. Sure, there are frameworks to help you, but the fact is, everything becomes more complex.

So I avoid it whenever I can.

Btw, for detecting cronjob overlapping, you can add a single line of Proc::PID::File test.

Steven Haryanto said...

As for Moose, try to develop for Mouse (or Any::Moose). 99% of Moose apps should run fine with Mouse. It can really cut your startup time by 50-75% or more.

Unknown said...

Thanks for your comments.

May well need to take a look at POEx::WorkerPool.

Cheers

Andy