Re: shift vs @_

David Christensen Mon, 21 May 2012 20:11:18 -0700

On 05/21/2012 12:40 PM, sono-io wrote:

David,
        Are you saying that it would be faster to do:
my $this_date = shift;
my $output = shift;
        as opposed to:
my ($this_date, $output) = @_;
        or am I not reading your assessment correctly?

1. Benchmarking on the target (production) computer is the onlymeaningful measure of which is faster. You should benchmark those onyour computer as a learning exercise. Please post your code and results.

2. As others have pointed out, there are other areas that can have afar greater impact on the performance of your code than shift/ assign/direct access of function arguments -- notably data structures andalgorithms. But your question is still valid, because there areimportant differences between the various approaches.

3. I do agree with your (05/21/2012 01:12 PM) comment to the effect ofstriving to write efficient code in the first place.

4. Performance vs. clarity is a trade-off. Money is the fundamentalquantitative measure (CPU's cost money, memory costs money, power costsmoney, your time is worth money, etc.). Confusion, stress, humanrelationships, style, and opinion are qualitative measures. There areothers. The stakeholders must decide what is important, and thenapproach the software accordingly. If it's your code for your personaluse, you decide.

I would like to make a counter argument to the "clarity first" position(even though I previously advocated it).

I was recently playing around with the Sieve of Eratosthenes (finds allthe prime numbers less than or equal to N) using threads, recursion, anditeration in Perl (code at end; somewhat dated).



I'll let the readers judge the various scripts for clarity.

Here are the performance benchmarks. Note that N increases by a factorof 10 with each script (from slowest to fastest), the first scriptcrashes after ~364 prime numbers (and produces a truncated answer), andthe second generates a warning after ~100 prime numbers (but producesthe correct answer):


2012-05-21 17:33:18 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-threads.pl 10000 | wc

Thread creation failed: pthread_create returned 11 at prime-threads.plline 18.

ran out of threads at prime-threads.pl line 20.
    364     364    1620

real    0m23.668s
user    0m24.494s
sys     0m15.949s

2012-05-21 17:36:08 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-recursion.pl 100000 | wc

Deep recursion on subroutine "main::check_num" at prime-recursion.plline 14.

   9593    9593   56128

real    0m22.796s
user    0m21.217s
sys     0m1.252s

2012-05-21 17:37:37 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-iterative3c.pl 1000000 | wc
  78499   78499  538470

real    0m4.684s
user    0m4.584s
sys     0m0.052s


Then I found an even better iterative algorithm:

2012-05-21 18:31:42 dpchrist@p43400e ~/sandbox/perl
$ time perl sieve2b.pl 10000000 | wc
 664579  664579 5227116

real    0m5.487s
user    0m5.368s
sys     0m0.208s

So, recursion is ~10x faster than threads, and iteration can be ~400xfaster yet.



Here's a CPAN library for comparison (Math::Prime::FastSieve):

2012-05-21 18:32:09 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-lib.pl 100000000 | wc
5761456 5761456 51099002

real    0m9.414s
user    0m8.885s
sys     0m0.516s

CPAN beats my best iterative script by ~6x using XS, C++, iteration, andbit vectors.

So, data structures, algorithms, and implementation details can makehuge differences in performance.

Okay, so what happens when N gets even bigger? Say 2^32? Or 2^64? Or10^100? My computer runs out of memory and grinds to a halt for N =1.0E+09 using Math::Prime::FastSieve. Even if I had enough memory, ~6minutes to compute all the primes up to N = 2^32 is still too slow.



When performance really matters -- there is an even faster algorithm:

    http://en.wikipedia.org/wiki/Sieve_of_Atkin

Looking at the diagrams and reading the text, I can pretend tounderstand it. But once it's reduced to tight code? I seriously doubtit. It would be unclear, but it would be even faster than my other(Perl or XS) choices.

Performance is what makes the difference between practical andimpractical software. Clear but impractical is, by definition, of nouse. Difficult but practical has value.



Therefore, performance is first and clarity is second.


HTH,

David

--

2012-05-21 18:32:43 dpchrist@p43400e ~/sandbox/perl
$ cat prime-threads.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
use threads;
use Thread::Queue;
sub check_num {
    my ($upstream, $cur_prime) = @_;
    my $kid;
    my $downstream = Thread::Queue->new();
    while (my $num = $upstream->dequeue()) {
        next unless $num % $cur_prime;
        if ($kid) {
            $downstream->enqueue($num);
        }
        else {
            print $num, "\n";
            $kid = threads->create(\&check_num, $downstream, $num);
            if (!$kid) {
                warn "ran out of threads";
                last;
            }
        }
    }
    if ($kid) {
        $downstream->enqueue(undef);
        $kid->join;
    }
}
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my $stream = Thread::Queue->new(3..$upper, undef);
    print "1\n2\n";
    check_num($stream, 2);
}



2012-05-21 18:33:09 dpchrist@p43400e ~/sandbox/perl
$ cat prime-recursion.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
sub check_num {
    my ($prime, $ra_nums) = @_;
    print $prime, "\n";
    my @stream;
    foreach my $num (@$ra_nums) {
        next if $num % $prime == 0;
        push @stream, $num;
    }
    if (scalar @stream) {
        check_num(shift @stream, \@stream);
    }
}
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    print "1\n";
    my @stream = (2 .. $upper);
    check_num(shift @stream, \@stream);
}



2012-05-21 18:33:17 dpchrist@p43400e ~/sandbox/perl
$ cat prime-iterative3c.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
use integer;
my $upper = @ARGV ? shift : 500;
my $max_prime = sqrt($upper);
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my @primes = (1, 2);
    my $in = [];
    for (my $i = 3; $i < $upper; $i += 2) { push @$in, $i }
    while (scalar @$in) {
        my $prime = shift @$in;
        push @primes, $prime;
        last if $max_prime < $prime;
        my $out = [ grep {$_ % $prime} @$in ];
        $in = $out;
    }
    print join("\n", @primes, @$in), "\n";
}



2012-05-21 18:33:24 dpchrist@p43400e ~/sandbox/perl
$ cat sieve2b.pl
#!/usr/bin/perl
# http://c2.com/cgi/wiki?SieveOfEratosthenesInManyProgrammingLanguages
# with some code cleanup and a few optimizations
use strict;
use warnings;
use integer;
my $n = shift or die "Usage: sieve2.pl UPPER";
my $sqrtn = sqrt($n);
my @c;
for(my $t = 3; $t < $sqrtn; $t += 2) {
    unless ($c[$t]) {
        my $twot = $t << 1;
        for(my $s = $t * $t; $s < $n; $s += $twot) {
            $c[$s]++;
        }
    }
}
print "2\n";
for(my $t = 3; $t < $n; $t += 2) {
    print "$t\n" unless $c[$t];
}



2012-05-21 18:33:31 dpchrist@p43400e ~/sandbox/perl
$ cat prime-lib.pl
#!/usr/bin/perl
use strict;
use warnings;
use Math::Prime::FastSieve      qw( primes );
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my $aref = primes($upper);
    print join("\n", 1, @$aref), "\n";
}

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

Re: shift vs @_

Reply via email to