On 2012-05-21 21:10, David Christensen wrote:
On 05/21/2012 12:40 PM, sono-io wrote:
David,
Are you saying that it would be faster to do:
my $this_date = shift;
my $output = shift;
as opposed to:
my ($this_date, $output) = @_;
or am I not reading your assessment correctly?
1. Benchmarking on the target (production) computer is the only
meaningful measure of which is faster. You should benchmark those on
your computer as a learning exercise. Please post your code and results.
2. As others have pointed out, there are other areas that can have a far
greater impact on the performance of your code than whether you use
shift, list assignment, or direct access to function arguments --
notably data structures and algorithms. But your question is still
valid, because there are important differences between the various
approaches.
3. I do agree with your (05/21/2012 01:12 PM) comment to the effect of
striving to write efficient code in the first place.
4. Performance vs. clarity is a trade-off. Money is the fundamental
quantitative measure (CPUs cost money, memory costs money, power costs
money, your time is worth money, etc.). Confusion, stress, human
relationships, style, and opinion are qualitative measures. There are
others. The stakeholders must decide what is important, and then
approach the software accordingly. If it's your code for your personal
use, you decide.
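
For point 1, a benchmark along these lines could be a starting point. It
is only a sketch using the core Benchmark module: the sub names
via_shift and via_list, the sample arguments, and the one-second run
time are assumptions made for illustration, and the numbers will differ
from machine to machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical subs for illustration; only the argument handling differs.
sub via_shift {
    my $this_date = shift;
    my $output    = shift;
    return "$this_date $output";
}

sub via_list {
    my ($this_date, $output) = @_;
    return "$this_date $output";
}

# Run each variant for at least 1 CPU second, then print a table of
# rates and relative differences.
cmpthese(-1, {
    'shift_args'  => sub { via_shift('2012-05-21', 'report.txt') },
    'list_assign' => sub { via_list('2012-05-21', 'report.txt') },
});
```

A negative count tells Benchmark to run for that many CPU seconds rather
than a fixed number of iterations, which tends to give steadier results
for very short subs.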
I would like to make a counterargument to the "clarity first" position
(even though I previously advocated it).
I was recently playing around with the Sieve of Eratosthenes (finds all
the prime numbers less than or equal to N) using threads, recursion, and
iteration in Perl (code at end; somewhat dated).
I'll let the readers judge the various scripts for clarity.
Here are the performance benchmarks. Note that N increases by a factor
of 10 with each script (from slowest to fastest). The first script
crashes after ~364 prime numbers (and produces a truncated answer); the
second generates a warning after ~100 prime numbers (but produces the
correct answer):
2012-05-21 17:33:18 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-threads.pl 10000 | wc
Thread creation failed: pthread_create returned 11 at prime-threads.pl line 18.
ran out of threads at prime-threads.pl line 20.
364 364 1620
real 0m23.668s
user 0m24.494s
sys 0m15.949s
2012-05-21 17:36:08 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-recursion.pl 100000 | wc
Deep recursion on subroutine "main::check_num" at prime-recursion.pl line 14.
9593 9593 56128
real 0m22.796s
user 0m21.217s
sys 0m1.252s
2012-05-21 17:37:37 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-iterative3c.pl 1000000 | wc
78499 78499 538470
real 0m4.684s
user 0m4.584s
sys 0m0.052s
Then I found an even better iterative algorithm:
2012-05-21 18:31:42 dpchrist@p43400e ~/sandbox/perl
$ time perl sieve2b.pl 10000000 | wc
664579 664579 5227116
real 0m5.487s
user 0m5.368s
sys 0m0.208s
So, recursion is ~10x faster than threads, and iteration can be ~400x
faster yet.
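
For reference, the basic iterative Sieve of Eratosthenes can be sketched
as below. This is not the sieve2b.pl or prime-iterative3c.pl script
timed above (those are not shown here); the sub name primes_up_to and
the demo limit of 50 are assumptions for illustration.

```perl
use strict;
use warnings;

# Return a reference to the list of primes <= $n, using a plain array
# of flags (one element per number).
sub primes_up_to {
    my ($n) = @_;
    return [] if $n < 2;

    my @is_composite = (0) x ($n + 1);

    # Only factors up to sqrt($n) need to be crossed off.
    for (my $i = 2; $i * $i <= $n; $i++) {
        next if $is_composite[$i];
        # Smaller multiples of $i were already marked by smaller primes.
        for (my $j = $i * $i; $j <= $n; $j += $i) {
            $is_composite[$j] = 1;
        }
    }

    return [ grep { !$is_composite[$_] } 2 .. $n ];
}

print join(' ', @{ primes_up_to(50) }), "\n";
# prints: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47
```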
Here's a CPAN library for comparison (Math::Prime::FastSieve):
2012-05-21 18:32:09 dpchrist@p43400e ~/sandbox/perl
$ time perl prime-lib.pl 100000000 | wc
5761456 5761456 51099002
real 0m9.414s
user 0m8.885s
sys 0m0.516s
CPAN beats my best iterative script by ~6x using XS, C++, iteration, and
bit vectors.
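
To illustrate the bit-vector idea in pure Perl, here is a minimal sketch
using the built-in vec() function. It is not how Math::Prime::FastSieve
is implemented (that module uses XS and C++); the sub name
count_primes_bitvec is an assumption for illustration.

```perl
use strict;
use warnings;

# Count the primes <= $n using a bit string: one bit per number,
# where a set bit means "composite".
sub count_primes_bitvec {
    my ($n) = @_;
    return 0 if $n < 2;

    my $bits = '';
    vec($bits, $n, 1) = 0;    # extend the string so it covers bits 0..$n

    for (my $i = 2; $i * $i <= $n; $i++) {
        next if vec($bits, $i, 1);
        for (my $j = $i * $i; $j <= $n; $j += $i) {
            vec($bits, $j, 1) = 1;
        }
    }

    my $count = 0;
    for my $k (2 .. $n) {
        $count++ unless vec($bits, $k, 1);
    }
    return $count;
}

print count_primes_bitvec(100), "\n";    # prints 25
```

The flags take roughly N/8 bytes this way, whereas a Perl array of
flags typically costs tens of bytes per element. Whether it is also
faster in pure Perl is something to benchmark rather than assume.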
So, data structures, algorithms, and implementation details can make
huge differences in performance.
Okay, so what happens when N gets even bigger? Say 2^32? Or 2^64? Or
10^100? My computer runs out of memory and grinds to a halt for N =
1.0E+09 using Math::Prime::FastSieve. Even if I had enough memory, ~6
minutes to compute all the primes up to N = 2^32 is still too slow.
When performance really matters, there is an even faster algorithm:
http://en.wikipedia.org/wiki/Sieve_of_Atkin
Looking at the diagrams and reading the text, I can pretend to
understand it. But once it's reduced to tight code? I seriously doubt
I could follow it. The code would be unclear, but it would be even
faster than my other (Perl or XS) choices.
Performance is what makes the difference between practical and
impractical software. Clear but impractical is, by definition, of no
use. Difficult but practical has value.
Therefore, performance is first and clarity is second.
Would you not agree that these are pretty extreme cases on which to
base such a wide-reaching decision? I don't know your experience, but in
mine, I've only a few times encountered code elaborate enough to
require extreme optimization.
Please remember... this is a *beginner* list. Promoting premature
optimization before clarity is not in the best interest of the intended
audience imho.
Steve