I'll put in a plug for vapply().

> # 100,000 numbers in 17576 groups:
> y <- rep(do.call(paste, c(list(sep=""), expand.grid(LETTERS,letters,letters))), length=1e5)
> x <- seq_along(y)^2
> system.time(val.vapply <- vapply(split(x, y), FUN=sum, FUN.VALUE=0))
   user  system elapsed
   0.18    0.02    0.20
> system.time(val.rowsum <- rowsum(x, y))
   user  system elapsed
   0.14    0.00    0.15
> system.time(val.tapply <- tapply(x, y, sum))
   user  system elapsed
   0.40    0.00    0.41
> all(val.vapply==val.rowsum)
[1] TRUE
> all(val.vapply==val.tapply)
[1] TRUE

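The benchmark only checks that the three results are equal element by element. As a minimal sketch on the five-element example from the original question, the code below shows that the three calls return differently shaped objects and builds the requested data.frame from one of them. The data.frame layout (a single `count` column with the group labels as row names) is an assumption read off the question, not something stated in the benchmark.

```r
# Toy data from the original question
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# Named numeric vector; FUN.VALUE = 0 makes vapply() check that
# every group yields exactly one number
val.vapply <- vapply(split(x, y), FUN = sum, FUN.VALUE = 0)

# One-column matrix with the group labels as row names
val.rowsum <- rowsum(x, y)

# One-dimensional array with the group labels as names
val.tapply <- tapply(x, y, sum)

# Assumed target layout: one 'count' column, group labels as row names
xy <- data.frame(count = unname(val.vapply), row.names = names(val.vapply))
print(xy)
```

On this input the sums are 5 for a, 2 for b and 8 for c, matching the table in the question. Since rowsum() already returns a matrix with row names, as.data.frame(val.rowsum) would be another route, though its column name would then need to be set by hand.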
S+ has fast functions groupSums, groupProds, etc. (one for each of
the standard summary functions) to deal with this sort of thing.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Bert Gunter
> Sent: Wednesday, August 31, 2011 10:10 AM
> To: Henrique Dallazuanna
> Cc: r-help; zhenjiang xu
> Subject: Re: [R] how to create data.frames from vectors with duplicates
>
> For the record, Henrique's use of rowsum() is about 10 times faster
> than using tapply (and presumably anything with table() ) on my
> computer. It calls a C primitive.
>
> -- Bert
>
> On Wed, Aug 31, 2011 at 9:55 AM, Henrique Dallazuanna <www...@gmail.com>
> wrote:
> > Try this:
> >
> > rowsum(x, y)
> >
> > On Wed, Aug 31, 2011 at 1:45 PM, zhenjiang xu <zhenjiang...@gmail.com>
> > wrote:
> >>
> >> Hi R users,
> >>
> >> suppose I have two vectors,
> >> > x=c(1,2,3,4,5)
> >> > y=c('a','b','c','a','c')
> >> How can I get a data.frame like this?
> >> > xy
> >>   count
> >> a     5
> >> b     2
> >> c     8
> >>
> >> I know a few ways to fulfill the task. However, I have a huge number
> >> of this kind of calculation, so I'd like an efficient solution. Thanks
> >>
> >> --
> >> Best,
> >> Zhenjiang
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Henrique Dallazuanna
> > Curitiba-Paraná-Brasil
> > 25° 25' 40" S 49° 16' 22" O