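Actually, the memory footprint can be reduced by rewriting the function so
that it uses kronecker() rather than outer(). Each element of p is then
compared against v separately, so only one vector of length(v) is held at a
time instead of the full length(v) x length(p) matrix.
Note that you do not get the 2D map as before ...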
> countinstance2<-function(v,p) {
+ sapply(p,function(x,y) sum(kronecker(x,y,"==")),v)
+ }
> v<-sample(1:10,size=10,replace=TRUE)
> p<-2:4
> v
[1] 3 8 1 2 3 5 6 5 5 1
> countinstance2(v,p)
[1] 1 2 0
>
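With a scalar x, kronecker(x, v, "==") should reduce to the plain elementwise comparison v == x, so the same counts can be obtained without kronecker. A minimal sketch of that equivalence, using the sample v printed above; countinstance3 is a hypothetical name used only for illustration and is not from the thread:

```r
# Version from the message above, repeated so the comparison is self-contained.
countinstance2 <- function(v, p) {
  sapply(p, function(x, y) sum(kronecker(x, y, "==")), v)
}

# Hypothetical alternative: compare v against one element of p at a time.
# vapply() additionally checks that every count is a single integer.
countinstance3 <- function(v, p) {
  vapply(p, function(x) sum(v == x), integer(1))
}

v <- c(3, 8, 1, 2, 3, 5, 6, 5, 5, 1)   # the sample shown in the transcript
p <- 2:4

print(countinstance2(v, p))
print(countinstance3(v, p))
```

Both calls should report the same counts; neither one keeps the 2D map.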
From: [email protected]
To: [email protected]
Subject: RE: [R] Count matches of a sequence in a vector?
Date: Thu, 22 Apr 2010 02:25:53 +0300
It may be possible to give a solution without a single for loop.
set.seed(1)
v<-sample(1:10,size=1e6,replace=TRUE)
p<-2:4
countinstance<-function(v,p) {
  # res[i,j] is TRUE when v[i]==p[j]
  res<-outer(v,p,FUN="==")
  # summing each column counts the matches for each element of p
  apply(res,2,sum)
}
> system.time(replicate(50,countinstance(v,p)))/50
   user  system elapsed
 0.2146  0.0248  0.2403
It is ~50% slower than the "f2" solution given previously ... but it also gives
you a 2D map of where the matches are.
These are stored in the "res" variable; use with care with very big datasets.
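To make the trade-off concrete, the sketch below reads match positions out of res and prints its size. It uses a short 10-element vector instead of the 1e6-element one, purely for illustration. R stores a logical in 4 bytes per element, so the full 1e6 x 3 matrix should come to roughly 12 MB.

```r
v <- c(3, 8, 1, 2, 3, 5, 6, 5, 5, 1)   # small sample, for illustration only
p <- 2:4

# Logical matrix with length(v) rows and length(p) columns.
res <- outer(v, p, FUN = "==")

counts <- apply(res, 2, sum)         # number of matches per element of p
pos3   <- which(res[, 2])            # positions in v equal to p[2], i.e. 3
hits   <- which(res, arr.ind = TRUE) # one (row, col) pair per match

print(counts)
print(pos3)
print(hits)
print(object.size(res), units = "b") # storage used by the map itself
```

If only the counts are needed, res can be discarded as soon as the column sums are taken.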
I wonder whether it is possible to reduce the memory footprint with bit-level
operations ...
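One way to explore the bit-level idea in base R is packBits(), which packs eight logical values into one raw byte. The sketch below is only an illustration under stated assumptions: pack_logical and count_packed are hypothetical helper names, the input must contain no NA, and v == x still creates a full-size logical vector before packing, so this shrinks what is kept rather than the peak usage.

```r
# Hypothetical helper: pack a logical vector into raw bytes, 8 flags per byte.
# packBits(type = "raw") needs a length that is a multiple of 8, so pad with FALSE.
pack_logical <- function(x) {
  pad <- (8 - length(x) %% 8) %% 8
  packBits(c(x, rep(FALSE, pad)), type = "raw")
}

# Hypothetical helper: count the set bits in a packed vector.
# rawToBits() expands each byte back to eight 0/1 values, so this is simple
# but not memory-optimal; the padding bits are FALSE and add nothing.
count_packed <- function(r) sum(as.integer(rawToBits(r)))

v <- c(3, 8, 1, 2, 3, 5, 6, 5, 5, 1)   # small sample, for illustration only
p <- 2:4

# One packed map per element of p.
packed <- lapply(p, function(x) pack_logical(v == x))
counts <- vapply(packed, count_packed, integer(1))

print(counts)
print(lengths(packed))   # bytes kept per pattern value
```

Under these assumptions each packed map takes about 1/32 of the space of the corresponding logical column, at the cost of extra work whenever positions have to be decoded.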
Christos Argyropoulos
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.