Looks like the rank function in R takes up most of the time. Replacing it with a sum reduces the time of the rollapply solution to one sixth of the its original time:
> system.time(rollapply(z,len, function(x) rank(x)[len])) user system elapsed 17.00 0.27 17.27 > system.time(rollapply(z,len, function(x) sum(x <= x[len]))) user system elapsed 2.72 0.28 3.01 On Sun, Apr 18, 2010 at 11:51 AM, zerdna <az...@yahoo.com> wrote: > > Gabor, Charles, Whit -- i've been walking the woods of R alone so far, and i > got to say that your replies to that trivial question are eye-opening > experience for me. Gentlemen, what i am trying to say in a roundabout way is > that i am extremely grateful and that you guys are frigging awesome. > > Let me outline the times i am getting for different proposed solutions on > the same machine, same data, same version of R > > x<-rnorm(50000); len<-100 > > 1. my naive roll.rank > > system.time(x.rank.1<-roll.rank(x,len)) > user system elapsed > 6.405 0.488 6.94 > > 2. Gabor's zoo > > z<-zoo(x) > system.time(rollapply(z,len, function(x) rank(x)[len])) > user system elapsed > 6.195 0.361 6.554 > > 3. Charles embed > > system.time(x.rank <- rowSums(x[ -(1:(len-1)) ] >= embed(x,len) )) > user system elapsed > 0.181 0.055 0.236 > > > 4. Whit's fts > dat<-fts(x) > system.time(x.rank<-moving.rank(dat, len)) > user system elapsed > 0.036 0 0.036 > > 5. Charles suggestion with inline, my crude implementation > > sig<-signature(x="numeric", rank="integer", n="integer", len="integer") > code<-"int k=0; for(int i=*len-1; i< *n; i++) {int r=1; for(int j=i-1; j> > i-len;j--) r+=(x[i]>x[j] ?1:0); rank[k++]<-r;}" > fns<-cfunction(sig,code, convention=".C") > > system.time( x.rank<-fns(x, numeric(length(x)-len), length(x), len)) > > user system elapsed > 0.011 0 0.011 > > > I guess i could speed it up from time being proportional to length(x)*len > to time proportional to length(x)*log(len) if i use slightly more > intelligent algo, but this works fine for my requirements. Only thing i > really wonder about is why exactly R takes 640 times more than this C code. > It would be immensely enlightening if someone could point to an explanation > of how execution in R works and where and when it slows down like this. > -- > View this message in context: > http://n4.nabble.com/efficient-rolling-rank-tp2013535p2014922.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.