mercy!!! ;-)
thanks, everyone. sure beats me trying to reinvent a slower version of the
wheel. came in very handy.

I think it would be nice to see some of these pointers in the "?by" manual
page. not sure who to ask to do this, but maybe this person reads r-help.

/iaw
----
Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)


On Mon, Aug 30, 2010 at 5:23 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

> On Mon, Aug 30, 2010 at 3:54 PM, Dennis Murphy <djmu...@gmail.com> wrote:
> > Hi:
> >
> > You've already gotten some good replies re aggregate() and plyr; here are
> > two more choices, from packages doBy and data.table, plus the others for
> > a contained summary:
> >
> > key <- c(1,1,1,2,2,2)
> > val1 <- rnorm(6)
> > indf <- data.frame( key, val1)
> > outdf <- by(indf, indf$key, function(x) c(m=mean(x), s=sd(x)) )
> > outdf
> >
> > # Alternatives:
> >
> > # aggregate (base) with new formula interface
> >
> > # write a small function to return multiple outputs
> > f <- function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
> >
> > aggregate(val1 ~ key, data = indf, FUN = f)
> >   key  val1.mean   val1.sd
> > 1   1 -0.9783589 0.6378922
> > 2   2  0.2816016 1.4490699
> >
> > # package doBy (get the same output)
> >
> > library(doBy)
> > summaryBy(val1 ~ key, data = indf, FUN = f)
> >   key  val1.mean   val1.sd
> > 1   1 -0.9783589 0.6378922
> > 2   2  0.2816016 1.4490699
> >
> > # package plyr
> >
> > library(plyr)
> > ddply(indf, .(key), summarise, mean = mean(val1), sd = sd(val1))
> >   key       mean        sd
> > 1   1 -0.9783589 0.6378922
> > 2   2  0.2816016 1.4490699
> >
> > # package data.table
> >
> > library(data.table)
> > indt <- data.table(indf)
> > indt[, list(mean = mean(val1), sd = sd(val1)), by = list(as.integer(key))]
> >      key       mean        sd
> > [1,]   1 -0.9783589 0.6378922
> > [2,]   2  0.2816016 1.4490699
> >
> > It's a cornucopia! :) Multiple grouping variables are no problem with
> > these functions, BTW.
> >
> > HTH,
> >
>
> And here are yet four more:
>
> > f.by <- function(x) c(key = x$key[1], mean = mean(x$val), sd = sd(x$val))
> > do.call(rbind, by(indf, indf["key"], f.by))
>   key        mean        sd
> 1   1 0.006794852 0.3779713
> 2   2 0.251890650 0.4379315
>
> > library(sqldf)
> > sqldf("select key, avg(val1) mean, stdev(val1) sd from indf group by key")
>   key        mean        sd
> 1   1 0.006794852 0.3779713
> 2   2 0.251890650 0.4379315
>
> > library(remix)
> > remix(val1 ~ key, transform(indf, key = factor(key)), funs = c(mean, sd))
> val1 ~ key
> ==========
>
> +-----+---+------+-------+------+
> |                | mean  | sd   |
> +=====+===+======+=======+======+
> | key | 1 | val1 | 0.01  | 0.38 |
> +     +---+------+-------+------+
> |     | 2 | val1 | 0.25  | 0.44 |
> +-----+---+------+-------+------+
>
> > library(Hmisc)
> > summary(val1 ~ key, indf, fun = function(x) c(mean = mean(x), sd = sd(x)))
> val1    N=6
>
> +-------+-+-+-----------+---------+
> |       | |N|mean       |sd.val1  |
> +-------+-+-+-----------+---------+
> |key    |1|3|0.006794852|0.3779713|
> |       |2|3|0.251890650|0.4379315|
> +-------+-+-+-----------+---------+
> |Overall| |6|0.129342751|0.3897180|
> +-------+-+-+-----------+---------+
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.