On Sun, 4 Jan 2009, Stavros Macrakis wrote:
On Sun, Jan 4, 2009 at 4:50 PM, <l...@stat.uiowa.edu> wrote:
On Sun, 4 Jan 2009, Stavros Macrakis wrote:
On Sat, Jan 3, 2009 at 7:02 PM, <l...@stat.uiowa.edu> wrote:
R's interpreter is fairly slow due in large part to the allocation of
argument lists and the cost of lookups of variables,
I'd think another problem is call-by-need. I suppose inlining or
batch analyzing groups of functions helps there.
Yes. The overhead can probably be reduced, at least in compiled code,
but it will always be significant. Many primitives are strict and do
not depend on call stack position, so inlining is safe, and that is done
in the current compiler. Figuring out whether inlining is safe for
user functions is more problematic and may need declarations.
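As a minimal illustration of the call-by-need point above (standard R
semantics, not code from this thread): each argument is passed as an
unevaluated promise, so an unused argument is never computed, but every
call still has to allocate the promises and the argument list.

    f <- function(x, y) x + 1       # y is never used, so its promise is never forced
    f(1, { Sys.sleep(10); 2 })      # returns 2 immediately; the sleep never runs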
including ones like [<- that are assembled and looked up as strings on every
call.
Wow, I had no idea the interpreter was so awful. Just some simple tree-to-tree
transformations would speed things up, I'd think, e.g. `<-`(`[`(...), ...) ==>
`[<-`(..., ...).
'Awful' seems a bit strong.
Well, I haven't looked at the code, but if I'm interpreting "assembled
and looked up as strings on every call" correctly, this means taking
names, expanding them to strings, concatenating them, re-interning
them, then looking up the value.
That's about it as I recall.
That sounds pretty awful to me both
in the sense of being inefficient and of being ugly.
Ugly: a matter of taste and opinion. Inefficient: yes, but in the
context of the way the rest of the computation is done it is simple
and efficient enough (no point in optimizing this given the other
issues at this point). It doesn't make the interpreter awful, which
is what you said.
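For context, the string handling described above is part of how R
evaluates replacement functions. Per the R Language Definition, a
subassignment is rewritten roughly as below; the name "[<-" is obtained
by pasting "<-" onto the function name, which is the per-call string
assembly and lookup under discussion.

    x <- c(10, 20, 30)
    x[2] <- 99
    # is approximately equivalent to:
    `*tmp*` <- x
    x <- `[<-`(`*tmp*`, 2, value = 99)
    rm(`*tmp*`)
    x                               # 10 99 30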
I'd think that one of the challenges will be the dynamic types --...
For now I am trying to get away without declarations, pre-testing
for the best cases before passing the others off to the current internal
code.
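To make the pre-testing idea concrete, here is a purely hypothetical
sketch (the function name and the particular checks are mine, not the
compiler's): test for the simple common case and fall back to the
general machinery otherwise.

    fast_add <- function(x, y) {
        if (is.double(x) && is.double(y) &&
            is.null(attributes(x)) && is.null(attributes(y)) &&
            length(x) == length(y)) {
            x + y                   # best case: plain numeric vectors, no dispatch needed
        } else {
            `+`(x, y)               # stand-in for handing off to the current internal code
        }
    }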
Have you considered using Java bytecodes and taking advantage of
dynamic compilers like Hotspot? They often do a good job in cases
like this by assuming that types are fairly predictable from one run
to the next of a piece of code. Or is the Java semantic model too
different?
My sense at this point is that this isn't a particularly good match,
in particular as one of my objectives is to try to take advantage of
opportunities for computing some compound numerical operations on
vectors in parallel. But the possibility of translating the R byte
code to C, JVM, .Net, etc. is something I'm trying to keep in mind.
luke
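As an illustration of the kind of opportunity mentioned above (an
example of mine, not from the thread): evaluated one operation at a
time, the compound expression below makes three passes over the data
and allocates two temporary vectors, which is exactly what a fused or
parallel evaluation could avoid.

    x <- runif(1e7)
    y <- exp(x) * 2 + 1             # exp(x), then * 2, then + 1: two temporaries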
...There is always a trade-off in complicating the code, with the consequences
for maintainability that implies.
Agreed entirely!
A 1.5 factor difference here I find difficult to get excited about, but it
might be worth a look.
I agree. The 1.5 isn't a big deal at all.
-s
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa
Department of Statistics and Actuarial Science
241 Schaeffer Hall
Iowa City, IA 52242
Phone: 319-335-3386
Fax:   319-335-3017
email: l...@stat.uiowa.edu
WWW:   http://www.stat.uiowa.edu
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.