On 2011-08-03 12:16, Deyan Ginev wrote:

I have heard a lot of bashing of the thread support in Perl and tried it
myself with limited success.

For example, some months ago I tried the standard "use threads;" and
remember seeing a segfault when it reached the "join" (but everything
worked well otherwise, which gave me a mixed feeling).

I will soon have access to a cluster of hexacore machines and am
thinking of the best way to utilize them - obviously threaded programs
would get me a long way in such a setup. Is there a "best" way to do
threads in Perl? Do you know of any production-ready applications that
are using Perl threads?

If at all possible, design in a 'map-reduce-merge' way
(which is basically just 'init / process / activate' anyway).

Then use a perl binary built without thread support
(perl -V:useithreads reports 'undef'),
because it is about 20% faster.

And fork.
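
In its barest form, with nothing but core Perl, the idea can be sketched
roughly as below. This is only an illustration, not how Parallel::Series
is implemented: fork_chunks is a made-up helper name, the chunks are toy
data, and it assumes a Unix-like system where fork and wait behave as
usual. Each child reports success or failure through its exit status.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not part of any module): run $process on each
# chunk in its own forked child and collect the children's exit statuses.
sub fork_chunks {
    my ( $chunks, $process ) = @_;

    my %chunk_of;    # pid => index of the chunk that child handles
    for my $ix ( 0 .. $#$chunks ) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;

        if ( $pid == 0 ) {    # child
            my $ok = $process->( $chunks->[ $ix ] );
            # _exit skips END blocks and buffers inherited from the parent
            POSIX::_exit( $ok ? 0 : 1 );
        }
        $chunk_of{ $pid } = $ix;    # parent remembers who does what
    }

    my @ok;    # $ok[$ix] is 1 if the child for chunk $ix exited with 0
    while ( %chunk_of ) {
        my $pid = wait;
        last if $pid == -1;    # no children left
        next unless exists $chunk_of{ $pid };
        $ok[ delete $chunk_of{ $pid } ] = ( $? == 0 ? 1 : 0 );
    }
    return \@ok;
}

# init: split the work into chunks (toy data)
my @chunks = ( [ 1 .. 3 ], [ 4 .. 6 ], [ 7 .. 9 ] );

# process: each child handles one chunk; values above 8 count as failure
my $ok = fork_chunks( \@chunks, sub {
    my ( $chunk ) = @_;
    return !grep { $_ > 8 } @$chunk;
} );

# activate: wrap up in the parent
printf "chunk %d: %s\n", $_ + 1, ( $ok->[ $_ ] ? 'ok' : 'failed' )
    for 0 .. $#$ok;
```

An exit status only carries a small integer, so real results would have
to come back some other way (files, pipes, a database). This sketch also
starts one child per chunk with no upper limit, which is fine for a
handful of chunks but not for thousands.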


My code (which runs on 24-core boxes with 96 GB RAM) often looks like:

#!/usr/bin/perl -wl
use strict;

# Each singer (=child) sings (=process)
# all of the lines (=job: ordered set of tasks).

use Data::Dumper;
use Parallel::Series;

my $NAME = "Brother Jacob Song";

my @lines = split /\n/, <<'EOT';
Brother Jacob - Brother Jacob
Sleeping still? - Sleeping still?
Morning bells are ringing! - Morning bells are ringing!
Ding, dang, dong - Ding, dang, dong
EOT

my @TASKS = map +{ ix => $_ + 1, info => $lines[ $_ ] },
                0 .. $#lines;

my $todo = Parallel::Series::->new(
    DEBUG   => 0,
    NAME    => $NAME,
    TASKS   => \@TASKS,
);

$todo->set(
    map    => \&init,      # returns \@jobs

    reduce => \&process,   # a child's job is to process an
                           # ordered set of (one or more) tasks

    merge  => \&activate,  # wrap up
);

$todo->LOG( 3, 'starting: %s', Dumper( $todo ) );

$todo->run;

exit 0;

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sub init {
    my ( $self ) = @_;
    $self->LOG( 2, '>>> init <<<' );
    return [ map [ $_ ], 'A' .. 'F' ];  # singers
}

sub process {
    my ( $self, $chunk, $task ) = @_;
    select( undef, undef, undef, 0.7 );  # built-in sleep() would truncate 0.7 to 0

    my $ok = ( rand > 0.2 ? 1 : 0 );  # simulate failure
    $self->LOG( 0, q(processed: chunk=%s.%s: %s),
                   join( '-', @$chunk ),
                   $task->{ ix },
                   ( $ok ? $task->{ info } : '<>' ) );
    return $ok;
}

sub activate {
    my ( $self, $skip ) = @_;
    $self->LOG( 2, '>>> activate <<<' );
    $self->LOG( 3, q(activate: skip=%s), $skip || 0 );
    return;
}

__END__



I haven't put Parallel::Series on CPAN yet,
but there are several fine alternatives there.

--
Ruud
