On Sun, 18 Apr 2010, zerdna wrote:


Gabor, Charles, Whit -- I've been walking the woods of R alone so far, and I
have to say that your replies to that trivial question are an eye-opening
experience for me. Gentlemen, what I am trying to say in a roundabout way is
that I am extremely grateful and that you guys are frigging awesome.

Let me outline the times I am getting for the different proposed solutions on
the same machine, with the same data and the same version of R:

x<-rnorm(50000); len<-100

1. My naive roll.rank

system.time(x.rank.1<-roll.rank(x,len))
   user  system elapsed
  6.405   0.488    6.94
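
The roll.rank function itself is not shown in this message. A minimal sketch of what such a naive version might look like, assuming it loops over the windows and calls rank() on each one, is given here; the body is a hypothetical reconstruction and not the poster's actual code.

```r
# Hypothetical reconstruction: the original roll.rank was not posted.
# For each window of `len` consecutive values, return the rank of the
# last (most recent) value within that window.
roll.rank <- function(x, len) {
  n <- length(x)
  out <- numeric(n - len + 1)
  for (i in len:n) {
    w <- x[(i - len + 1):i]          # current window, oldest value first
    out[i - len + 1] <- rank(w)[len] # rank of the newest value
  }
  out
}

x.small <- c(3, 1, 4, 1.5, 9, 2.6)
roll.rank(x.small, 3)
# [1] 3 2 3 2
```

Each pass through the loop allocates a new window and ranks all of it, although only one of the len ranks is kept.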

2. Gabor's zoo

library(zoo)
z<-zoo(x)
system.time(rollapply(z,len, function(x) rank(x)[len]))
   user  system elapsed
  6.195   0.361   6.554

3. Charles's embed

system.time(x.rank <- rowSums(x[ -(1:(len-1)) ] >= embed(x,len) ))
   user  system elapsed
  0.181   0.055   0.236
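
To see why the embed one-liner gives a rolling rank, it may help to run it on a tiny vector. The sketch below uses made-up data (x.small and the window length 3 are illustrative only) and spells out the intermediate objects.

```r
x.small <- c(3, 1, 4, 1.5, 9, 2.6)
len.small <- 3

# embed() puts one window in each row, newest value in the first column:
#      [,1] [,2] [,3]
# [1,]  4.0  1.0  3.0
# [2,]  1.5  4.0  1.0
# [3,]  9.0  1.5  4.0
# [4,]  2.6  9.0  1.5
m <- embed(x.small, len.small)

# The newest value of each window: x with its first len-1 entries dropped.
last <- x.small[-(1:(len.small - 1))]

# `last` is recycled down the columns, so row i is compared with last[i].
# Counting the TRUEs in a row gives the number of window values that are
# <= the newest one, which is its rank (ties count towards the rank).
ranks <- rowSums(last >= m)
ranks
# [1] 3 2 3 2
```

All of the comparisons happen in a single vectorised call on a matrix of (length(x)-len+1)*len elements, so the approach trades memory for speed.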


4. Whit's fts

library(fts)
dat<-fts(x)
system.time(x.rank<-moving.rank(dat, len))
   user  system elapsed
  0.036   0.000   0.036

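5. Charles's suggestion with inline, my crude implementation

library(inline)
sig<-signature(x="numeric", rank="integer", n="integer", len="integer")
# With convention=".C" every argument arrives as a pointer, so len and n
# are dereferenced, and the result is stored with C's "=" (not R's "<-").
code<-"int k=0; for(int i=*len-1; i< *n; i++) {int r=1;
  for(int j=i-1; j>i-*len; j--) r+=(x[i]>x[j] ?1:0); rank[k++]=r;}"
fns<-cfunction(sig,code, convention=".C")

# There are length(x)-len+1 windows; .C returns a list, so extract $rank.
system.time( x.rank<-fns(x, integer(length(x)-len+1), length(x), len)$rank)

   user  system elapsed
  0.011   0.000   0.011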


I guess I could bring the running time down from proportional to length(x)*len
to proportional to length(x)*log(len) with a slightly more intelligent
algorithm, but this works fine for my requirements. The only thing I really
wonder about is why exactly R takes roughly 630 times longer than this C code
(6.94 s versus 0.011 s elapsed). It would be immensely enlightening if someone
could point to an explanation of how execution in R works and where and when it
slows down like this.
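
As a rough illustration only, and not an account of R's internals, the sketch below times two per-window computations that return the same ranks. It assumes a smaller made-up input (x.demo, len.demo) so that it finishes quickly. The first calls rank() on every window, which ranks the whole window just to keep one value. The second does one vectorised comparison per window, which is the same work the C loop does.

```r
set.seed(1)
x.demo <- rnorm(5000)
len.demo <- 100
idx <- len.demo:length(x.demo)

# One rank() call per window: the full window is ranked each time.
t.rank <- system.time(
  r1 <- vapply(idx, function(i) rank(x.demo[(i - len.demo + 1):i])[len.demo],
               numeric(1)))

# One comparison per window: count the values <= the newest one.
t.cmp <- system.time(
  r2 <- vapply(idx, function(i)
                 as.numeric(sum(x.demo[i] >= x.demo[(i - len.demo + 1):i])),
               numeric(1)))

# Both give the same ranks when there are no ties.
stopifnot(isTRUE(all.equal(r1, r2)))
rbind(rank = t.rank, compare = t.cmp)[, 1:3]
```

The timings depend on the machine and the R version, so none are quoted here. Whatever gap remains between the second version and the C code comes from evaluating an R function call for every window.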

Well, you can always read the source code.

But short of that see

        ?Rprof

then try stuff like this:

x <- rnorm(50000)
len <- 100
Rprof()
x.rank <- rowSums(x[ -(1:(len-1)) ] >= embed(x,len) )
Rprof(NULL)
summaryRprof()
$by.self
              self.time self.pct total.time total.pct
embed              0.10     31.2       0.22      68.8
>=                 0.08     25.0       0.08      25.0
+                  0.06     18.8       0.06      18.8
-                  0.04     12.5       0.04      12.5
rowSums            0.02      6.2       0.32     100.0
rep.int            0.02      6.2       0.02       6.2
inherits           0.00      0.0       0.30      93.8
is.data.frame      0.00      0.0       0.30      93.8

$by.total
              total.time total.pct self.time self.pct
rowSums             0.32     100.0      0.02      6.2
inherits            0.30      93.8      0.00      0.0
is.data.frame       0.30      93.8      0.00      0.0
embed               0.22      68.8      0.10     31.2
>=                  0.08      25.0      0.08     25.0
+                   0.06      18.8      0.06     18.8
-                   0.04      12.5      0.04     12.5
rep.int             0.02       6.2      0.02      6.2

$sampling.time
[1] 0.32

HTH,

Chuck


--
View this message in context: 
http://n4.nabble.com/efficient-rolling-rank-tp2013535p2014922.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
