Hi Matthew,

On Thu, Apr 29, 2010 at 9:52 AM, Matthew Dowle <mdo...@mdowle.plus.com> wrote:
> I don't know about that, but try this :
>
> install.packages("data.table", repos="http://R-Forge.R-project.org")
> require(data.table)
> summaries = data.table(summaries)
> summaries[,sum(counts),by=symbol]
>
> Please let us know if that returns the correct result, and if its
> memory/speed is ok ?

Thanks for directing me to the data.table package. I read through some of the vignettes, and it looks quite nice.

While your sample code would provide an answer if I just wanted to compute some summary statistic/function of groups of my data.frame (using `by=symbol`), what's the best way to produce several pieces of info per subset?

For instance, I see that I can do something like this:

  summaries[, list(counts=sum(counts), width=sum(exon.width)), by=symbol]

But what if I need to do some more complex processing within the subsets defined by `by=symbol`, say several lines of programming logic for one result? I guess I can open a new block that just returns a data.table? Like:

  summaries[, {
    cnts <- sum(counts)
    ew <- sum(exon.width)
    # ... some complex things
    complex <- # .. result of complex things
    data.table(counts=cnts, width=ew, cplx=complex)
  }, by=symbol]

Is that right? (I mean, it looks like it's working, but maybe there's a more idiomatic way?)

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.