Hello,
To measure function call performance I am using a simple, naive
Fibonacci calculation. It is equivalent to the PIR example in
examples/benchmark/fib.pir
The perl6 code is:
    use v6;
    sub fib ($n) {
        return $n if $n < 2;
        return fib($n - 1) + fib($n - 2);
    }
    say fib(22);
Here are my observations.
1. PIR variant.
It is quite fast. It calculates fib(32) in 4-5 seconds, which is
comparable to the perl5 time of 5-6 seconds. In well-optimized builds it
was amazingly fast: before the merge of the ppd25cx branch, on the i686
arch with the "-Ot" flag, it calculated fib(32) in a fraction of a
second. I have tried to reproduce that performance with various
combinations of optimization flags (1, 2, p, c) and different run cores,
without success. Also, on the amd64 architecture it does not perform
considerably better with the -Ot flag.
2. Rakudo
The perl6 variant performs quite poorly. I waited more than 20 minutes
for it to calculate fib(32), but it did not finish (2 x Xeon 3GHz
QuadCore, 2G RAM). It also used all of the 2G of RAM. So I chose fib(22)
as the base for comparison.
Typically rakudo calculates fib(22) in 7-8 seconds (the NQP variant
performs comparably). After the ppd25cx merge it performed the same as
before. The ppd25cx builds use optimization flag -O1 because, when
configured with -O2 (the default optimization level in the
configuration), parrot crashes: it could not even compile the Rakudo and
NQP interpreters.
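A rough way to see why fib(32) is so much heavier than fib(22) is to
count the sub calls each one makes. The sketch below is not part of the
original benchmark: fib_calls is a hypothetical helper name, and it
assumes the naive two-branch recursion shown earlier, where every call
with $n >= 2 makes exactly two further calls.

```perl6
use v6;

# Number of sub calls the naive recursive fib makes for a given $n.
# Uses calls(n) = calls(n-1) + calls(n-2) + 1 with calls(0) = calls(1) = 1,
# evaluated iteratively so the helper itself does not recurse.
# (fib_calls is a hypothetical name, not part of the benchmark.)
sub fib_calls ($n) {
    my ($prev, $cur) = 1, 1;
    for 2..$n {
        ($prev, $cur) = $cur, $prev + $cur + 1;
    }
    return $cur;
}

say fib_calls(22);                    # 57313
say fib_calls(32);                    # 7049155
say fib_calls(32) / fib_calls(22);    # ratio of the two workloads
```

fib(32) makes about 123 times as many calls as fib(22). If the cost per
call stayed constant, a 7-8 second fib(22) would extrapolate to roughly
14 to 16 minutes for fib(32), before any extra slowdown from running out
of memory.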
Last night I checked the performance again and I was disappointed:
perl6 calculated fib(22) in more than 40 seconds. I think this is a
bad regression.
I have compared the PIR output produced by the compiler, and the
generated code is unchanged. Therefore there must be a change in parrot
that affects perl6 function calls without affecting all parrot function
calls.
So I have tried to find the change that is detrimental to perl6 function
call performance. These are my benchmarks for the problematic revisions:
revision    parrot -C     parrot -C -G
30125       41 seconds    6 seconds
30123       30 seconds    6 seconds
30122        8 seconds    6 seconds
changed files:
src/ops/core.ops (r30123, r30125)
src/pmc/exception.pmc (r30123)
src/sub.c (r30123)
t/op/exceptions.t (r30123)
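To make the size of the regression explicit, the timings in the table
can be reduced to two numbers per revision: the slowdown relative to
r30122 and the time not accounted for by the -G run. This is only a
sketch over the figures reported above; gc_extra and slowdown are
hypothetical helper names, and -G is read here as running with GC
disabled.

```perl6
use v6;

# Extra seconds taken by the -C run compared with the -C -G run.
sub gc_extra ($with_gc, $without_gc) { return $with_gc - $without_gc }

# Slowdown factor of a timing relative to a baseline timing.
sub slowdown ($time, $baseline) { return $time / $baseline }

# Timings in seconds copied from the table: revision, -C, -C -G.
my @runs = [30122, 8, 6], [30123, 30, 6], [30125, 41, 6];

for @runs -> $run {
    my $extra  = gc_extra($run[1], $run[2]);
    my $factor = slowdown($run[1], 8);    # 8 s is the r30122 timing
    say "r{$run[0]}: {$extra} s extra with GC, {$factor}x the r30122 time";
}
```

The -G column stays at 6 seconds across all three revisions, while the
extra time in the -C column grows from 2 to 24 to 35 seconds. On these
numbers the whole slowdown appears only in the runs with GC enabled.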
What confuses me a lot is that the r30125 change to core.ops just
replaces "string_from_literal" with "const_string" on one line. I
thought that const_string would be faster because it doesn't need to
allocate memory and produces less garbage. But it turns out to be a lot
slower: it adds 11 seconds to the execution time (with GC enabled).
Best regards
luben