Hi this one could be slightly quicker but i am not completely sure because it gives me different results from yours
set.seed(111) x<-x1<-x2<- data.frame(a=sample(1:50, 10000, replace=T), b=sample(100:500, 10000, replace=T)) y<- data.frame(a=sample(1:50, 10000, replace=T), b=sample(100:500, 10000, replace=T)) system.time({ one<-paste(x[,1], x[,2], sep=".") two<-paste(y[,1], y[,2], sep=".") tab<-table(two[two %in% one]) x1[match(names(tab), one),3]<-tab }) system.time({ z <- merge(x, y, by=c("a", "b"), all.x=TRUE) x2<-t(sapply(split(z, z[,1:2], drop=TRUE), function(.grp){ if (any(is.na(.grp))) return(c(.grp[1,1], .grp[1,2], 0)) c(.grp[1,1], .grp[1,2], nrow(.grp)) })) }) head(x2) head(x1[order(x1[,1], x1[,2]),]) Regards Petr r-help-boun...@r-project.org napsal dne 13.04.2009 21:51:18: > try this: > > > x > V1 V2 V3 > 1 5000000 3200000 0 > 2 5100000 3100000 0 > 3 5200000 3100000 0 > 4 5200000 3200000 0 > > y > V1 V2 V3 > 1 5000000 3200000 1 > 2 5000000 3200000 1 > 3 5200000 3100000 1 > 4 5200000 3000000 1 > > z <- merge(x, y, by=c("V1", "V2"), all.x=TRUE) > > t(sapply(split(z, z[,1:2], drop=TRUE), function(.grp){ > + if (any(is.na(.grp))) return(c(.grp[1,1], .grp[1,2], 0)) > + c(.grp[1,1], .grp[1,2], nrow(.grp)) > + })) > [,1] [,2] [,3] > 5100000.3100000 5100000 3100000 0 > 5200000.3100000 5200000 3100000 1 > 5000000.3200000 5000000 3200000 2 > 5200000.3200000 5200000 3200000 0 > > > > > On Mon, Apr 13, 2009 at 1:06 PM, Chris82 <rubenba...@gmx.de> wrote: > > > > Hi R-users, > > > > I create a simple code to check out how often the same numbers in y occur in > > x. For example 5000000 320000 occurs two times. > > But the code with the loop is extremly slow. x have 6100 lines and y > > sometimes more than 50000 lines. > > > > Is there any alternative code to create with R? > > > > thanks. > > > > > > x > > > > 5000000 3200000 0 > > 5100000 3100000 0 > > 5200000 3100000 0 > > 5200000 3200000 0 > > > > > > lengthx <- length(x[,1]) > > > > y > > > > 5000000 3200000 1 > > 5000000 3200000 1 > > 5200000 3100000 1 > > 5200000 3000000 1 > > > > > > langthy <- length(y[,1]) > > > > for (i in 1:lengthx){ > > for (j in 1:lengthy){ > > if (x[i,1] == y[j,1]){ > > if (x[i,2] == y[j,2]){ > > x[i,3] <- x[i,3] + 1 > > } > > } > > } > > } > > x > > > > 1 5000000 3200000 2 > > 2 5100000 3100000 0 > > 3 5200000 3100000 1 > > 4 5200000 3200000 0 > > -- > > View this message in context: http://www.nabble.com/Faster-Solution-for-a- > simple-code--tp23024985p23024985.html > > Sent from the R help mailing list archive at Nabble.com. > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.