Perl Idea: GearmanX::Worker

I am now in the process of moving a number of CPU-intensive jobs into Gearman workers.

What Is

Unfortunately, the boilerplate code required for that is quite crufty:

First, inside your workers you are supposed to do this:

use Gearman::Worker;
my $worker = Gearman::Worker->new;

That object is not what it appears to be. It is not a worker, but actually the handle to the job server. Yes, good naming is a long-lost art.

Annoyingly, the constructor does not accept the location of the job server; instead you have to call yet another method:

$worker->job_servers('127.0.0.1');

Then you are supposed to register all worker code with the handle:

$worker->register_function('client_calling'
                            => sub { ... } );

The key is the name a client will use when launching the job; the sub reference is what then gets invoked inside the worker process.

The Cruft Goes On

Inside the worker code the first parameter to consume is that of a Gearman::Job. You will not be overly surprised that that class is left undocumented.

sub men_at_work {
   my $job = shift;

If your job receives parameters, you will have to live with the restriction that Gearman itself only passes strings around. So anything more complex than a string has to be frozen by the client and thawed by your sub:

use Storable qw(thaw freeze);
   my ($ignitable) = thaw $job->arg;

Same thing when you need to pass back complex results:

return freeze [ 1, 2, 3 ];
}

That is all quite messy.
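To make the round trip concrete, here is a minimal sketch that runs without a job server. FakeJob is a hypothetical stand-in that only models the arg accessor of Gearman::Job, and the payload (a list of numbers to sum) is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical stand-in for Gearman::Job: only the arg() accessor is modelled.
{
    package FakeJob;
    sub new { my ($class, $arg) = @_; return bless { arg => $arg }, $class }
    sub arg { return $_[0]{arg} }
}

# Worker side: thaw the argument string, do the work, freeze the result.
sub men_at_work {
    my $job     = shift;
    my $numbers = thaw( $job->arg );
    my $sum     = 0;
    $sum += $_ for @$numbers;
    return freeze( { count => scalar @$numbers, sum => $sum } );
}

# Client side: everything that crosses the wire is a plain string.
my $request = freeze( [ 1, 2, 3 ] );
my $reply   = men_at_work( FakeJob->new($request) );
my $result  = thaw($reply);

printf "count=%d sum=%d\n", $result->{count}, $result->{sum};
```

Both sides have to agree on the shape of the frozen data, which is exactly the kind of convention that ends up duplicated in every worker.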

And What Could Be

Here are my initial thoughts on how this could look instead:

package MyWorker;
use base qw(GearmanX::Worker);

sub men_at_work  :expose(client_calling)  {
   my $param = shift;
   return 23 ;
}

1;

Here the user simply inherits from GearmanX::Worker and marks one or more subs as exposable using an attribute on each sub. If the explicit client name is omitted, it falls back to the internal one (men_at_work).
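One way such an attribute could be collected is Perl's MODIFY_CODE_ATTRIBUTES hook. The following is only a sketch under assumptions, not the actual GearmanX::Worker code: the package name GearmanX::Worker::Sketch, the exposed method and the lunch_break sub are all hypothetical. Internal names are resolved lazily by scanning the symbol table, since relying on the sub's name at attribute time is fragile.

```perl
use strict;
use warnings;

package GearmanX::Worker::Sketch;    # hypothetical name, for illustration only

my %marked;    # class => [ [ code ref, explicit client name or undef ], ... ]

# Perl calls this at compile time for every attribute on a sub in a subclass.
sub MODIFY_CODE_ATTRIBUTES {
    my ($class, $code, @attrs) = @_;
    my @unknown;
    for my $attr (@attrs) {
        if ($attr =~ /\A expose (?: \( \s* (\w+) \s* \) )? \z/x) {
            push @{ $marked{$class} }, [ $code, $1 ];
        }
        else {
            push @unknown, $attr;    # let Perl complain about anything else
        }
    }
    return @unknown;
}

# Returns (client name => code ref) pairs for all exposed subs of a class.
sub exposed {
    my ($class) = @_;
    no strict 'refs';
    my %internal;    # stringified code ref => sub name in the symbol table
    for my $sym (keys %{"${class}::"}) {
        next unless defined &{"${class}::$sym"};
        $internal{ \&{"${class}::$sym"} } = $sym;
    }
    my %table;
    for my $entry (@{ $marked{$class} || [] }) {
        my ($code, $client_name) = @$entry;
        $client_name = $internal{$code} unless defined $client_name;
        $table{$client_name} = $code;
    }
    return %table;
}

package MyWorker;
use parent -norequire, 'GearmanX::Worker::Sketch';
no warnings 'reserved';    # lowercase attribute names trigger a warning

sub men_at_work :expose(client_calling) { return 23 }
sub lunch_break :expose                 { return 'back at one' }
sub helper                              { return 'never exposed' }

package main;

my %table = MyWorker->exposed;
print "$_ => ", $table{$_}->(), "\n" for sort keys %table;
```

A real base class would hand each pair in the table to register_function instead of printing it.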

Outside the package the main program will generate worker instances:

my $w = MyWorker->new;

These are really workers, but they are still inactive. To start them you do one of the following:

$w->run_as_thread;  # will create thread & detach
$w->run_as_process; # will fork & return
$w->run;            # will block here
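As an illustration of the fork-and-return variant, here is a minimal sketch. ForkingWorker and its on_run callback are hypothetical stand-ins; a real run would block in the Gearman work loop rather than invoke a callback once.

```perl
use strict;
use warnings;

package ForkingWorker;    # hypothetical stand-in for GearmanX::Worker

sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Stand-in for the blocking loop; a real worker would poll the job server here.
sub run {
    my ($self) = @_;
    $self->{on_run}->();
    return;
}

# Fork, run the blocking loop in the child, return the child's PID to the caller.
sub run_as_process {
    my ($self) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent: hand back control immediately
    $self->run;             # child: blocks until run() returns
    exit 0;
}

package main;

pipe( my $reader, my $writer ) or die "pipe failed: $!";

my $w   = ForkingWorker->new( on_run => sub { syswrite $writer, "working in $$\n" } );
my $pid = $w->run_as_process;

close $writer;              # parent keeps only the read end
my $line = <$reader>;       # the child reports its own PID
waitpid $pid, 0;
print $line;
```

Returning the PID lets the caller reap or signal the worker later; whether the real method should return it is a design choice left open here.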

To Storable Or Not To Storable

To control whether the parameters have to be thawed or whether the result has to be frozen another attribute can be used:

sub men_at_work  :expose(client_calling)
                 :storable(in) {
   my @params = @_;
   return 23 ;
}

In the example above, Storable is used only on the incoming side, not on the outgoing side. If you also plan to return complex results, say so with out:

sub men_at_work  :expose(client_calling)
                 :storable(in, out) {
   my @params = @_;
   return [ 1, 2, 3 ];
}
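Behind the scenes, the in and out flags could be honoured by wrapping the user's sub in a closure that does the thawing and freezing. This is a sketch under assumptions: wrap_storable and FakeJob are hypothetical names, the client is assumed to freeze an array reference of parameters, and nfreeze is used for the outgoing side.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical stand-in for Gearman::Job: only the arg() accessor is modelled.
{
    package FakeJob;
    sub new { my ($class, $arg) = @_; return bless { arg => $arg }, $class }
    sub arg { return $_[0]{arg} }
}

# Hypothetical helper: turn a plain sub into a handler that takes a job object.
sub wrap_storable {
    my ($code, %opt) = @_;
    return sub {
        my ($job) = @_;
        # in:  the argument string is a frozen array ref of parameters
        my @params = $opt{in} ? @{ thaw( $job->arg ) } : ( $job->arg );
        my $result = $code->(@params);
        # out: the result must be a reference, since nfreeze only accepts refs
        return $opt{out} ? nfreeze($result) : $result;
    };
}

# The user's sub sees plain Perl values on both sides.
my $handler = wrap_storable(
    sub { my @params = @_; return [ map { $_ * 2 } @params ] },
    in => 1, out => 1,
);

my $reply  = $handler->( FakeJob->new( nfreeze( [ 1, 2, 3 ] ) ) );
my $result = thaw($reply);
print "@$result\n";
```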

I have all this working here, but I am still waiting for inspiration on what to do on the client side.

Ideas?

Storable versus other serializations and KiokuDB

Hi,

Would using Storable not make it hard to submit a job in Perl but handle it via Java or even C, via the language bindings? Storable is quite fast though, although the value of a distributed queue lies in your horizontal scaling ability, not dead fast vertical speed.
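One language-neutral option, assuming JSON is acceptable on both ends, is to encode the arguments with the core JSON::PP module instead. The payload below is invented for illustration.

```perl
use strict;
use warnings;
use JSON::PP ();

# utf8: produce and consume bytes; canonical: stable key order in the output
my $json = JSON::PP->new->utf8->canonical;

# Client side: any JSON-capable worker (Perl, Java, C, ...) could read this.
my $wire = $json->encode( { task => 'resize', sizes => [ 64, 128 ] } );

# Worker side (Perl shown here): decode back into plain data structures.
my $args = $json->decode($wire);
print "$args->{task}: @{ $args->{sizes} }\n";
```

The trade-off is that JSON only carries plain data: blessed objects and code references do not survive the trip the way they can with Storable.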

If you wanted to stay in Perl, you might consider looking at something like the way KiokuDB serializes Perl Moose objects. Then objects could be deserialized on the other side, so instead of passing parameters you get a full instance with all the behaviors and methods needed to perform the job.

Actually, I was thinking this could be a great Moose::Meta::Method subclass, where you'd have a method that would automatically serialize its containing object and pass it on to a job queue for processing. The method would return a job id handle object you could poll for status. Like

class MyApp {

method slow_sql_query as Queued {
return $self->dbi_do('[big multi join SQL]');
}
}

Then in your instance

my $job = $myapp->slow_sql_query;

while ( $job->not_complete ) {
    warn 'job still running';
    sleep 1;   # avoid spinning while polling
}

print $job->return;

I realize you may not be so familiar with Moose or MooseX::Declare, but feel free to look me up on IRC or message me since I am starting on something like this later in the month.

john

John Napiorkowski (not verified) | Sat, 05/30/2009 - 17:09

If you're planning to own the protocol ...

... then I'd probably just Storable anything that's a ref and tack a flags character on the front (say 'S' for storable and 'P' for plain string, pick better names as you wish).

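More importantly: nfreeze, not freeze! You do *not* want to depend on the native binary layout. freeze writes data in the machine's own byte order and integer sizes, so images produced on x86 and x86_64 are not guaranteed to be readable on each other, and that -will- bite you in the ass later.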

Matt S Trout (not verified) | Sat, 05/30/2009 - 19:33
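The flag-character idea from the comment above could be sketched like this. The function names encode_arg and decode_arg are hypothetical, and plain values are assumed to be defined strings.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# 'S' marks a Storable image (network byte order), 'P' marks a plain string.
sub encode_arg {
    my ($value) = @_;
    return ref $value ? 'S' . nfreeze($value) : 'P' . $value;
}

sub decode_arg {
    my ($wire) = @_;
    my $flag = substr $wire, 0, 1;
    my $body = substr $wire, 1;
    return thaw($body) if $flag eq 'S';
    return $body       if $flag eq 'P';
    die "unknown flag '$flag' in argument\n";
}

my $plain   = decode_arg( encode_arg('hello') );
my $complex = decode_arg( encode_arg( { ids => [ 1, 2, 3 ] } ) );

print "$plain\n";
print "@{ $complex->{ids} }\n";
```

With this in place, neither side needs to know in advance whether a given argument is complex; the first byte says so.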