Hi folks, I'm trying to figure out how to summarize data by several categorical columns. Rather than a summary for every combination of those columns, I want one for each value of each categorical column, regardless of the other columns. I could do this with three separate commands, but I'm wondering if there's a more elegant way that I'm missing. Thanks!
allie

> my_df = data.frame(a = c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(1,0,1,0,1,0), dat=c(10,11,12,13,14,15))
> my_df
  a b c dat
1 1 0 1  10
2 1 0 0  11
3 1 0 1  12
4 0 1 0  13
5 0 1 1  14
6 0 1 0  15
> # not what I want
> ddply(my_df, .(a,b,c), function(x) c("mean"=mean(x$dat), "n"=nrow(x)))
  a b c mean n
1 0 1 0   14 2
2 0 1 1   14 1
3 1 0 0   11 1
4 1 0 1   11 2

What I want:

  a b c mean n
1 1 * *   11 3
2 * 1 *   14 3
3 * * 1   12 3

where "*" refers to any value of the other columns.
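One possible way to get the per-column summaries in a single pass is to loop over the grouping columns and stack the results. This is only a sketch in base R, not necessarily the most elegant option. The helper name summarise_by and the long output layout (a "column" and a "value" field in place of the "*" placeholders) are assumptions made for illustration.

```r
my_df <- data.frame(a = c(1, 1, 1, 0, 0, 0),
                    b = c(0, 0, 0, 1, 1, 1),
                    c = c(1, 0, 1, 0, 1, 0),
                    dat = c(10, 11, 12, 13, 14, 15))

# Hypothetical helper: summarise dat by the values of one column,
# ignoring all other columns.
summarise_by <- function(df, col) {
  means  <- tapply(df$dat, df[[col]], mean)
  counts <- tapply(df$dat, df[[col]], length)
  data.frame(column = col,
             value  = names(means),
             mean   = as.vector(means),
             n      = as.vector(counts))
}

# Apply the helper to each grouping column and stack the pieces.
result <- do.call(rbind, lapply(c("a", "b", "c"), summarise_by, df = my_df))

# Keep only the rows where the grouping column equals 1,
# which corresponds to the three rows in the desired output.
wanted <- subset(result, value == "1")
print(wanted)
```

The full result also holds the summaries for value 0 of each column, so the subset step can be dropped if every value is of interest.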