On 9/30/2005 1:41 PM, hadley wickham wrote:
> I'm not entirely sure what you want, but maybe this does the trick?
>
> data.frame.by <- function(data, variables, fun, ...) {
>   if (length(variables) == 0 ) {
>     df <- data.frame(results = 0)
>     df$results <- list(fun(data$value, ...))
>     return(df)
>   }
>
>   sorted <- sort.df(data, variables)[,c(variables), drop=FALSE]
>   duplicates <- duplicated(sorted[,variables, drop=FALSE])
>   index <- cumsum(!duplicates)
>
>   results <- by(data, index, fun, ...)
>
>   cols <- sorted[!duplicates,variables, drop=FALSE]
>   cols$results <- array(results)
>   cols
> }
>
>
> sort.df <- function(data, vars) {
>   data[do.call("order", data[,vars, drop=FALSE]), ,drop=FALSE]
> }
>
>
> dataset <- data.frame(gp1 = rep(1:2, c(4,4)), gp2 = rep(1:4, c(2,2,2,2)), value = rnorm(8))
>
> data.frame.by(dataset, c("gp1", "gp2"), function(data) mean(data$value))
> data.frame.by(dataset, "gp1", function(data) tapply(data$value, data$gp2, mean))
> data.frame.by(dataset, "gp1", function(data) lm(gp2 ~ value, data)) # doesn't print, but everything is there ok
>
> (note that the results column will be a list if necessary - this may
> be a serious abuse of data frames, but I'm not sure and no one replied
> when I queried the list)
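One thing worth checking when reading the quoted function: index is computed from the sorted rows, while by() is applied to data in its original order, so the grouping appears to depend on the input already being sorted by the grouping variables (as the example dataset is). The following is only a sketch of a variant that sorts once and splits the sorted frame. The name group_apply and the test data are made up for illustration, and the zero-variable branch is left out for brevity.

```r
# Hypothetical variant of the quoted function; group_apply is an
# illustrative name, not something from the original post.
group_apply <- function(data, variables, fun, ...) {
  stopifnot(length(variables) > 0)

  # Sort once and keep the whole sorted frame, so that the group index
  # computed below lines up row for row with the data being split.
  ord <- do.call("order", unname(as.list(data[, variables, drop = FALSE])))
  sorted <- data[ord, , drop = FALSE]

  # A new group starts at every row whose key combination is not a repeat.
  duplicates <- duplicated(sorted[, variables, drop = FALSE])
  index <- cumsum(!duplicates)

  # One result per group, in the same order as the unique key rows.
  results <- lapply(split(sorted, index), fun, ...)

  cols <- sorted[!duplicates, variables, drop = FALSE]
  rownames(cols) <- NULL

  # Use a plain vector when every result is an atomic scalar,
  # otherwise fall back to a list column.
  scalar <- vapply(results, function(r) is.atomic(r) && length(r) == 1L,
                   logical(1))
  cols$results <- if (all(scalar)) {
    unlist(results, use.names = FALSE)
  } else {
    unname(results)
  }
  cols
}

# Deliberately unsorted input, with fixed values instead of rnorm().
dataset <- data.frame(gp1 = rep(1:2, c(4, 4)),
                      gp2 = rep(1:4, c(2, 2, 2, 2)),
                      value = 1:8)
shuffled <- dataset[c(8, 1, 6, 3, 2, 7, 4, 5), ]
res <- group_apply(shuffled, "gp1", function(d) mean(d$value))
```

With this input the two group means should be 2.5 and 6.5, regardless of the row order going in.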

I think this should work. Thanks!

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel