Perl Police: Developer On The Run (Graph::Bipartite)

Ok, here is my new MO (method of operation):

Whenever I see an awful piece of Perl (even if only a 0.01 version and especially if it is on CPAN), then I analyze it thoroughly, write the developer an email with my findings. And wait for a response.
After 3 emails without response, I rat out the module to the Perl Police. Even if it is a 0.01 version and especially if it is on CPAN. What happens then, is out of my hands....

Graph::Bipartite Guilty as Charged

When evaluating packages to implement a Petri net functionality, I eventually came across Graph::Bipartite. It is supposed to represent a bipartite graph, and - this seems the only agenda - to compute the maximum matching inside the graph.

While the version number 0.01 may be cause for caution, that does not necessarily mean anything. But as the docs are somewhat thin, I risked a look under the hood. Just to make my blood freeze. And then boil.

Fraudulent Object Behaviour

Normally, when you instantiate an object, such as

my $g = Graph::Bipartite->new( 5, 4 );

you have the implicit expectation that all the object values are stored, uhm, inside the object right?

Wrong. Inside the package you find:

package Graph::Bipartite;

my $n1;
my $n2;
my $n;
my @neighbours;

sub new {
    $n1 = $_[ 1 ];
    $n2 = $_[ 2 ];
    $n = $n1 + $n2;
    for( my $i = 0; $i < $n; $i++ ) {
         $neighbours[ $i ] = [];
    }
    my $class = shift;
    my $self = { };
    bless( $self, $class );
    return $self;
}

Everything you pass in is stored in a global variable. Which means you have a lot of fun in your web application (or in your threads). Interestingly, $n2 is kept, but never used later on.

To make matters worse (much worse actually) these are not the only global variables. There is

my @matching;

which holds the current state of affairs during the execution of the overall matching algorithm, and there is

my @level;

which is completely ephemeral.

I simply suspect that the author was too lazy to pass around parameters.

Turning in Cycles

And since we are already there: All you for-loop fetishists, please consider NOT using them. The neighbours array can be initialized easily with

@neighbours = map {[]} 0..$n-1;

Really.

The same applies to one of the methods:

sub maximum_matching {
    for( my $i = 0; $i < $n; ++$i ) {
       $matching[ $i ] = -1;
    }
    ...

That can be replaced by

@matching = -1 x $n;

Really. Although here I would suspect that undef would be more appropriate to mark that there is nothing here.

The method continues with

my %h;
    for( my $i = 0; $i < $n1; ++$i ) {
         if( $matching[ $i ] != -1 ) {
             $h{ $i } = $matching[ $i ];
         }
    }
    %h;

where one wonders why this should be better than

return
    map  { $_ => $matching[$_] }
    grep { $matching[$_] != -1 }
    0 .. $n1-1;

Is there a price somewhere for having as many variables as possible?

Baroque or still Rococo?

The other remaining method also reveals that the author struggled somewhat with the concept of lists:

sub neighbours {
   if( scalar( @{ $neighbours[ $_[ 1 ] ] } ) > 0 ) {
       return scalar( @{ $neighbours[ $_[ 1 ] ] } );
   }
   0;
}

First is checked whether the list contains at least one element. If so, that number is returned, otherwise it is 0.

Now even my cat could simplify this to give me the fu**ing length of the list:

return scalar( @{ $neighbours[ $_[ 1 ] ] } );

I, on the other hand, would have used

return @{ $neighbours[ $_[ 1 ] ] };

to hand over the user the option to have this in a scalar or a list context. And this also would make more obvious that the name of the original method is quite misleading.

We are not finished. Let's look at the insert_edge method:

sub insert_edge {
    push( @{ $neighbours[ $_[ 1 ] ] }, $_[ 2 ] );
    push( @{ $neighbours[ $_[ 2 ] ] }, $_[ 1 ] );
}

and particularily what happens when the user does

$g->insert_edge( 2, 3 );
$g->insert_edge( 2, 3 );

Yes, they will be counted twice. And the neighbours method will not compensate for that either.

Navel Gazing

More disconcerting is the mindset behind the module. It basically shouts:

I have one particular problem and I solve that particular problem, but I pretend it is a generalized solution.

I resent that. Because it wastes my time.

If a package names itself Graph::Bipartite, then my expectation would be that it is (a) a Graph, maybe even a subclass of Graph, so that I can have graph operations. OTOH, that graph package comes with its own set of problems.

And (b) my expectation is that I can have my own nodes. But the nodes in Graph::Bipartite are just integers. To use this package I would have to have yet another mapping between my node set and these artificial numbers. Not overly convenient.

And The Worst Part

The worst part is, that the matching algorithm actually seems to work. And seems to work fast, very fast.

But how could I tell that it is correct? Yes, you guessed it: No test cases. Sheeesh. So no. I personally do not want many people to program in Perl. And push packages onto CPAN.

15000 other CPAN distributions to go.

Posted In

Small nit

Wrong: @matching = -1 x $n;
Right: @matching = (-1) x $n;

dagolden (not verified) | Thu, 05/21/2009 - 18:12

take ownership

Its was uploaded in 2003, the author probably does not use perl anymore, there is a procedure to take control of a module if the owner does not want to maintain it .

As far as your comment on you do not want many people to program in perl well its your personal opinion but I am glad the community does not feel that way .

lpbck1 (not verified) | Thu, 05/21/2009 - 20:10

Re: take ownership

... there is a procedure to take control of a module ...

Yes, I already have done this once.

But I am very careful and make great efforts to track down the author. This takes quite some time.

... glad the community does not feel that way .

Maybe. But there are different languages for differently minded people. And getting all PHP programmers back is NOT something I would look forward to.

rho | Thu, 05/21/2009 - 20:13

maybe you should fix <a

maybe you should fix <a href=http://matrix.cpantesters.org/?author=drrho>your modules </a> to pass their failing tests and then start looking at others

Anonymous (not verified) | Thu, 05/21/2009 - 21:27

Re: maybe you should fix

... then start looking at others

Rightio. This is exactly how this world works. If I would not know it better, though, I think you may have missed the point.

rho | Thu, 05/21/2009 - 22:53