I ran into a similar issue with a simple benchmark the other day,
where a plain loop in Lua was faster than vectorised code in R ...
hmm, would you be saying that R's vectorised performance is overhyped?
or is it just that non-vectorised code in R is slow?
What I meant, I guess, was (apart from a little bit of trolling) that
I'd had misconceptions about the speed differences between loops and
vectorised code. In particular, I had entertained the naive belief
that vectorised solutions are always highly efficient (I wonder if I'm
the only one who was naive enough to think this ..), so I was very
much surprised to find a loop in an interpreted language like Lua to
be faster than vectorised R code.
My silly little benchmark translated the Lua code

    sum = 0
    for i=1,N do sum = sum + i end

into vectorised R

    sum(as.numeric(1:N))

The performance results were as follows:

    for loop in R:   0.75 Mops/s  (  2000000 ops in 2.66 s)
    vectorised R:   29.75 Mops/s  ( 50000000 ops in 1.68 s)
    Lua:            51.54 Mops/s  (100000000 ops in 1.94 s)
    Perl:            8.26 Mops/s  ( 10000000 ops in 1.21 s)

Note that Lua is an interpreted language (compiled to byte code); with
the just-in-time compiler I get more than 230 Mops/s.
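
For anyone who wants to try the R side of this, here is a minimal sketch of how such a comparison could be set up. It is not the code behind the numbers above: the helper `mops()`, the function names, and the value of N are made up for illustration, and timings will differ from machine to machine.

```r
# Illustrative sketch only; mops(), loop_sum(), vec_sum() and N are
# hypothetical names and values, not the original benchmark code.

# Time a function call and report throughput in million operations per second.
mops <- function(n_ops, fun) {
  elapsed <- system.time(result <- fun())[["elapsed"]]
  list(result = result, mops = n_ops / elapsed / 1e6)
}

# Plain interpreted loop, the direct translation of the Lua code.
loop_sum <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

# Vectorised version: build the whole vector, then sum it.
vec_sum <- function(n) sum(as.numeric(seq_len(n)))

N <- 1e6
loop <- mops(N, function() loop_sum(N))
vec  <- mops(N, function() vec_sum(N))

cat(sprintf("for loop in R: %.2f Mops/s\n", loop$mops))
cat(sprintf("vectorised R:  %.2f Mops/s\n", vec$mops))
```

For small N the elapsed time can round to zero, so N should be large enough for the run to take a measurable amount of time.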
I suspect that this has to do with cache thrashing, since the
vectorised code in R operates on large vectors that have to be read
from / written to RAM, while the Lua loop presumably runs entirely
from the L1 cache. (Before you ask, I split the vectorised R code
into a loop that processes 1 million numbers at a time; I tried
different ways of coding the benchmark and picked the fastest solution.)
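
The chunked variant described in the parenthesis might look roughly like the sketch below. Again, this is a reconstruction under assumptions rather than the benchmarked code: the function name `chunked_sum` and its `chunk` argument are hypothetical, with the default chunk size taken from the "1 million numbers at a time" mentioned above.

```r
# Illustrative sketch; chunked_sum() and its arguments are hypothetical.
# Sums the integers 1..n, vectorising over at most `chunk` numbers at a
# time so that no single temporary vector is longer than `chunk`.
chunked_sum <- function(n, chunk = 1e6) {
  total <- 0
  start <- 1
  while (start <= n) {
    end <- min(start + chunk - 1, n)
    # Inner work is vectorised; only the chunk-level loop is interpreted.
    total <- total + sum(as.numeric(start:end))
    start <- end + 1
  }
  total
}

chunked_sum(2500000)  # three chunks: 1e6, 1e6 and 5e5 numbers
```

The outer loop runs only n / chunk times, so its interpreter overhead is small compared with the vectorised work inside it.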
Perhaps loops in R aren't always as slow (compared to matrix
operations) as one seemed to think.
depends how and where you use them. in the problem discussed here,
they do slow down the code for some class of inputs and do not speed
it up for the others, compared to the array version of pat.
My mistake was to think that vectorisation will always give a
substantial performance boost and that for-loops should be avoided
whenever possible. But it's really just the inner loops that need to
be vectorised: iterating over the outer margins of a matrix doesn't
add much overhead, especially if the vectorised solution would have to
operate on huge matrices.
Guess that's a bad habit from my old Matlab days (back in the early
90s) ...
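
To make the point about inner and outer loops concrete, here is a small sketch with made-up data (the matrix `m`, its dimensions, and the variable names are purely illustrative). It computes the sum of squares of each column twice: once with an explicit loop over columns whose body is vectorised, and once fully vectorised, which needs a temporary matrix as large as `m` itself.

```r
# Illustrative sketch with made-up data; m and its size are assumptions.
set.seed(1)
m <- matrix(runif(1000 * 200), nrow = 1000, ncol = 200)

# Outer loop over the columns; the work on each column is vectorised,
# so only one column-sized temporary exists at a time.
col_ss_loop <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  col_ss_loop[j] <- sum(m[, j]^2)
}

# Fully vectorised: m^2 allocates a temporary matrix the size of m.
col_ss_vec <- colSums(m^2)

# Both approaches should agree up to floating-point tolerance.
print(all.equal(col_ss_loop, col_ss_vec))
```

Which version is faster will depend on the matrix size and the machine; the sketch only shows that the loop runs once per column, not once per element.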
Best,
Stefan