Thursday, 29 July 2010

A word to the Mooses out there (Miice?)

Just found this today after upgrading to the latest Perl::Critic, which includes the policy

Subroutines::ProhibitUnusedPrivateSubroutines

This causes a problem with all the many

_build_

methods (the Moose attribute builders), as it thinks they are unused private subroutines, not seeing them called anywhere else in the code.

To fix this (thanks to the CPAN documentation for helping me get there), add the following to your (.)perlcriticrc:

[Subroutines::ProhibitUnusedPrivateSubroutines]
private_name_regex = _(?!build_)\w+

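The regex defines what the policy treats as a private name. The negative lookahead (?!build_) means anything beginning _build_ no longer counts as private, so it is no longer reported as unused.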
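
A quick way to convince yourself what the pattern does is to try it on a few names. This is only a sketch: the sub name is_private_name and the sample names are made up for illustration, and anchoring the pattern to the whole name is my assumption about how the policy applies it.

```perl
use strict;
use warnings;

# Same pattern as in the perlcriticrc entry, anchored to the whole name
# (the anchoring is an assumption about how the policy applies it).
my $private_name_regex = qr/\A_(?!build_)\w+\z/;

# Hypothetical helper: returns 1 if the name would count as private, else 0.
sub is_private_name {
    my ($name) = @_;
    return $name =~ $private_name_regex ? 1 : 0;
}

foreach my $name (qw( _helper _build_schema _builder public_method )) {
    printf "%-15s %s\n", $name,
        is_private_name($name) ? 'private' : 'not private';
}
```

Note that _builder still counts as private, since only the exact prefix _build_ is excluded.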

Cheers

Thursday, 22 July 2010

Difficult to track bug

Here was a difficult bug to spot (using 5.8.8 and 5.10.1, not tried on 5.12):



my $output_path = $self->output_path();
my $bam_filename = $self->bam_filename_root();

if ( some condition ) {

    $output_path .= q{lane} . $self->position_decode_string() . q{/}.
    $bam_filename .= $self->position_decode_string();

}


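This was caused by a lack of due care and attention after a bit of copy-and-paste refactoring: the first line inside the if ends with a . where there should be a ;, so the two lines are read as one statement.

Unfortunately, the code is perfectly legit as far as Perl is concerned, and it parses as though the .= after $bam_filename is just a ., without also doing the concat to $bam_filename.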

Bit of a pain in the backside to find this one.
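
For comparison, here is a minimal sketch of what the two lines were meant to do, with each statement ending in its own semicolon. The literal values are stand-ins I have made up so the snippet runs on its own; the real code gets them from methods on $self.

```perl
use strict;
use warnings;

# Stand-in values (hypothetical); the real code calls methods on $self.
my $position     = q{3};
my $output_path  = q{/data/run/};
my $bam_filename = q{sample};

# Two separate statements, each terminated by ';',
# so each variable is appended to independently.
$output_path  .= q{lane} . $position . q{/};
$bam_filename .= $position;

print "$output_path\n";     # /data/run/lane3/
print "$bam_filename\n";    # sample3
```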

Monday, 12 July 2010

A small journey in Benchmarking

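The task: collapse runs of consecutive LSF indices into ranges, so that 1000,1001,1002,1003,1006 becomes [1000-1003,1006]. The contenders: 20ish lines of verbose code involving hashes, arrays and grouping, against a magic piece of regex which does the same thing in 3 lines.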


use Benchmark q{:all};

my @lsf_indices = ( 1000,1001,1002,1003,1006,3000,3300,3301,3302,3303,3304,3305,3306,3998,3999,4000,4001,4002);

my %methods = (

    # Regex approach: replace each run of consecutive numbers with first-last
    regex => sub {
        my $array_string = join q{,}, @lsf_indices;
        $array_string =~ s/\b(\d+)(,((??{$+ + 1}))\b)+/$1-$+/g;
        $array_string = q{[} . $array_string . q{]};
    },

    # Verbose approach: group consecutive indices under the first index of each run
    verbose => sub {
        my ( $previous, $current_working_index );
        my %consecutive;

        foreach my $index ( @lsf_indices ) {

            if ( $previous && ( $index == $previous + 1 ) ) {
                push @{ $consecutive{ $current_working_index } }, $index;
                $previous = $index;
            } else {
                $previous = $index;
                $current_working_index = $index;
                push @{ $consecutive{ $current_working_index } }, $index;
            }

        }

        # Turn each group into either a single number or a first-last range
        my @array;
        foreach my $index ( sort { $a <=> $b } keys %consecutive ) {

            if ( scalar @{ $consecutive{$index} } == 1 ) {

                push @array, qq{$consecutive{$index}->[0]};

            } else {

                my $last = pop @{ $consecutive{$index} };
                my $first = shift @{ $consecutive{$index} };
                push @array, $first . q{-} . $last;

            }

        }

        my $array_string = q{[} . ( join q{,}, @array ) . q{]};
    },

);

cmpthese( 30_000, \%methods);
timethese( 30_000, \%methods);


Result


home$ ./benchmark.pl
           Rate   regex verbose
regex    6048/s      --    -72%
verbose 21898/s    262%      --
Benchmark: timing 30000 iterations of regex, verbose...
regex: 5 wallclock secs ( 4.96 usr + 0.01 sys = 4.97 CPU) @ 6036.22/s (n=30000)
verbose: 1 wallclock secs ( 1.37 usr + 0.00 sys = 1.37 CPU) @ 21897.81/s (n=30000)


The verbose code is around 3.6 times faster (21898/s against 6048/s). Happy :) Code I can read, and speed benefits as well.

Admittedly, a bit of an exercise, since this isn't really a bottleneck. ;)
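
For anyone wanting to reuse the idea outside a benchmark, here is a sketch of the same grouping wrapped in a sub. It is a variant, not the code timed above: the name collapse_indices is made up, it tracks runs in an array of [first, last] pairs rather than a hash, and it assumes the input is already sorted in ascending order.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse a sorted list of integers into range notation.
sub collapse_indices {
    my @indices = @_;
    my @runs;    # each element is [ first, last ] of one consecutive run

    foreach my $index (@indices) {
        if ( @runs && $index == $runs[-1][1] + 1 ) {
            $runs[-1][1] = $index;             # extend the current run
        }
        else {
            push @runs, [ $index, $index ];    # start a new run
        }
    }

    # Single-element runs print as one number, longer runs as first-last
    my @parts;
    foreach my $run (@runs) {
        my ( $first, $last ) = @{$run};
        push @parts, $first == $last ? $first : $first . q{-} . $last;
    }

    return q{[} . ( join q{,}, @parts ) . q{]};
}

my @lsf_indices = ( 1000,1001,1002,1003,1006,3000,3300,3301,3302,3303,3304,3305,3306,3998,3999,4000,4001,4002);

print collapse_indices(@lsf_indices), "\n";
# prints [1000-1003,1006,3000,3300-3306,3998-4002]
```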