On 05/21/2012 12:40 PM, sono-io wrote:
David,
        Are you saying that it would be faster to do:
my $this_date = shift;
my $output = shift;
        as opposed to:
my ($this_date, $output) = @_;
        or am I not reading your assessment correctly?

1. Benchmarking on the target (production) computer is the only meaningful measure of which is faster. You should benchmark those on your computer as a learning exercise. Please post your code and results.

2. As others have pointed out, there are other areas that can have a far greater impact on the performance of your code than shift/ assign/ direct access of function arguments -- notably data structures and algorithms. But your question is still valid, because there are important differences between the various approaches.

3. I do agree with your (05/21/2012 01:12 PM) comment to the effect of striving to write efficient code in the first place.

4. Performance vs. clarity is a trade-off. Money is the fundamental quantitative measure (CPU's cost money, memory costs money, power costs money, your time is worth money, etc.). Confusion, stress, human relationships, style, and opinion are qualitative measures. There are others. The stakeholders must decide what is important, and then approach the software accordingly. If it's your code for your personal use, you decide.


I would like to make a counter argument to the "clarity first" position (even though I previously advocated it).


I was recently playing around with the Sieve of Eratosthenes (finds all the prime numbers less than or equal to N) using threads, recursion, and iteration in Perl (code at end; somewhat dated).


I'll let the readers judge the various scripts for clarity.


Here are the performance benchmarks. Note that N increases by a factor of 10 with each script (from slowest to fastest), the first script crashes after ~364 prime numbers (and produces a truncated answer), and the second generates a warning after ~100 prime numbers (but produces the correct answer):

2012-05-21 17:33:18 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-threads.pl 10000 | wc
Thread creation failed: pthread_create returned 11 at prime-threads.pl line 18.
ran out of threads at prime-threads.pl line 20.
    364     364    1620

real    0m23.668s
user    0m24.494s
sys     0m15.949s

2012-05-21 17:36:08 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-recursion.pl 100000 | wc
Deep recursion on subroutine "main::check_num" at prime-recursion.pl line 14.
   9593    9593   56128

real    0m22.796s
user    0m21.217s
sys     0m1.252s

2012-05-21 17:37:37 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-iterative3c.pl 1000000 | wc
  78499   78499  538470

real    0m4.684s
user    0m4.584s
sys     0m0.052s


Then I found an even better iterative algorithm:

2012-05-21 18:31:42 dpchrist@p43400e ~/sandbox/perl
$ time perl sieve2b.pl 10000000 | wc
 664579  664579 5227116

real    0m5.487s
user    0m5.368s
sys     0m0.208s


So, recursion is ~10x faster than threads, and iteration can be ~400x faster yet.


Here's a CPAN library for comparison (Math::Prime::FastSieve):

2012-05-21 18:32:09 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-lib.pl 100000000 | wc
5761456 5761456 51099002

real    0m9.414s
user    0m8.885s
sys     0m0.516s

CPAN beats my best iterative script by ~6x using XS, C++, iteration, and bit vectors.


So, data structures, algorithms, and implementation details can make huge differences in performance.


Okay, so what happens when N gets even bigger? Say 2^32? Or 2^64? Or 10^100? My computer runs out of memory and grinds to a halt for N = 1.0E+09 using Math::Prime::FastSieve. Even if I had enough memory, ~6 minutes to compute all the primes up to N = 2^32 is still too slow.


When performance really matters -- there is an even faster algorithm:

    http://en.wikipedia.org/wiki/Sieve_of_Atkin


Looking at the diagrams and reading the text, I can pretend to understand it. But once it's reduced to tight code? I seriously doubt it. It would be unclear, but it would be even faster than my other (Perl or XS) choices.


Performance is what makes the difference between practical and impractical software. Clear but impractical is, by definition, of no use. Difficult but practical has value.


Therefore, performance is first and clarity is second.


HTH,

David

--

2012-05-21 18:32:43 dpchrist@p43400e ~/sandbox/perl
$ cat prime-threads.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
use threads;
use Thread::Queue;
sub check_num {
    my ($upstream, $cur_prime) = @_;
    my $kid;
    my $downstream = Thread::Queue->new();
    while (my $num = $upstream->dequeue()) {
        next unless $num % $cur_prime;
        if ($kid) {
            $downstream->enqueue($num);
        }
        else {
            print $num, "\n";
            $kid = threads->create(\&check_num, $downstream, $num);
            if (!$kid) {
                warn "ran out of threads";
                last;
            }
        }
    }
    if ($kid) {
        $downstream->enqueue(undef);
        $kid->join;
    }
}
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my $stream = Thread::Queue->new(3..$upper, undef);
    print "1\n2\n";
    check_num($stream, 2);
}



2012-05-21 18:33:09 dpchrist@p43400e ~/sandbox/perl
$ cat prime-recursion.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
sub check_num {
    my ($prime, $ra_nums) = @_;
    print $prime, "\n";
    my @stream;
    foreach my $num (@$ra_nums) {
        next if $num % $prime == 0;
        push @stream, $num;
    }
    if (scalar @stream) {
        check_num(shift @stream, \@stream);
    }
}
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    print "1\n";
    my @stream = (2 .. $upper);
    check_num(shift @stream, \@stream);
}



2012-05-21 18:33:17 dpchrist@p43400e ~/sandbox/perl
$ cat prime-iterative3c.pl
#!/usr/bin/perl
# perldoc perlthrtut
use strict;
use warnings;
use integer;
my $upper = @ARGV ? shift : 500;
my $max_prime = sqrt($upper);
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my @primes = (1, 2);
    my $in = [];
    for (my $i = 3; $i < $upper; $i += 2) { push @$in, $i }
    while (scalar @$in) {
        my $prime = shift @$in;
        push @primes, $prime;
        last if $max_prime < $prime;
        my $out = [ grep {$_ % $prime} @$in ];
        $in = $out;
    }
    print join("\n", @primes, @$in), "\n";
}



2012-05-21 18:33:24 dpchrist@p43400e ~/sandbox/perl
$ cat sieve2b.pl
#!/usr/bin/perl
# http://c2.com/cgi/wiki?SieveOfEratosthenesInManyProgrammingLanguages
# with some code cleanup and a few optimizations
use strict;
use warnings;
use integer;
my $n = shift or die "Usage: sieve2.pl UPPER";
my $sqrtn = sqrt($n);
my @c;
for(my $t = 3; $t < $sqrtn; $t += 2) {
    unless ($c[$t]) {
        my $twot = $t << 1;
        for(my $s = $t * $t; $s < $n; $s += $twot) {
            $c[$s]++;
        }
    }
}
print "2\n";
for(my $t = 3; $t < $n; $t += 2) {
    print "$t\n" unless $c[$t];
}



2012-05-21 18:33:31 dpchrist@p43400e ~/sandbox/perl
$ cat prime-lib.pl
#!/usr/bin/perl
use strict;
use warnings;
use Math::Prime::FastSieve      qw( primes );
my $upper = @ARGV ? shift : 500;
my $iter  = @ARGV ? shift : 1;
die "Usage: $0 [UPPER [ITER]]"
    unless 2 < $upper && $upper == int($upper)
        && 0 < $iter  && $iter  == int($iter);
for (1 .. $iter) {
    my $aref = primes($upper);
    print join("\n", 1, @$aref), "\n";
}

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/


Reply via email to