perfect. this is the R way to do it quickly and easily. thank you, marc.

PS: in my earlier example, what I wanted was

    aggregate(. ~ key, data = indf, FUN = function(x) c(m = mean(x), s = sd(x)))
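One detail that may matter when using the aggregate() result further: when FUN returns a vector, each summarised variable is stored as a matrix column, even though it prints as if it were several columns. The sketch below shows one common way to flatten it. It assumes a made-up `indf` with a grouping column `key` and numeric columns `x` and `y`; none of these come from the original data.

```r
# Hypothetical example data; `indf`, `key`, `x` and `y` are assumed names.
indf <- data.frame(
  key = c("A", "A", "B", "B"),
  x   = c(1, 3, 4, 6),
  y   = c(10, 20, 30, 50)
)

# Summarise every non-key column by group, returning two values per group.
res <- aggregate(. ~ key, data = indf,
                 FUN = function(v) c(m = mean(v), s = sd(v)))

# Each summarised variable is a single matrix column with columns "m" and "s".
names(res)
is.matrix(res$x)

# Flatten the matrix columns into ordinary columns (x.m, x.s, y.m, y.s).
flat <- do.call(data.frame, res)
names(flat)
```

After flattening, `flat$x.m` and `flat$x.s` can be used like any other numeric column.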
----
Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

On Mon, Aug 30, 2010 at 10:47 AM, Marc Schwartz <marc_schwa...@me.com> wrote:
>
> FYI, since R version 2.11.0, aggregate() can return a vector of summary
> results, rather than just a scalar:
>
>> aggregate(iris$Sepal.Length, list(Species = iris$Species),
>            function(x) c(Mean = mean(x), SD = sd(x)))
>      Species    x.Mean      x.SD
> 1     setosa 5.0060000 0.3524897
> 2 versicolor 5.9360000 0.5161711
> 3  virginica 6.5880000 0.6358796
>
>
> There is also now a formula interface:
>
>> aggregate(. ~ Species, data = iris,
>            FUN = function(x) c(Mean = mean(x), SD = sd(x)))
>      Species Sepal.Length.Mean Sepal.Length.SD Sepal.Width.Mean
> 1     setosa         5.0060000       0.3524897        3.4280000
> 2 versicolor         5.9360000       0.5161711        2.7700000
> 3  virginica         6.5880000       0.6358796        2.9740000
>   Sepal.Width.SD Petal.Length.Mean Petal.Length.SD Petal.Width.Mean
> 1      0.3790644         1.4620000       0.1736640        0.2460000
> 2      0.3137983         4.2600000       0.4699110        1.3260000
> 3      0.3224966         5.5520000       0.5518947        2.0260000
>   Petal.Width.SD
> 1      0.1053856
> 2      0.1977527
> 3      0.2746501
>
>
> HTH,
>
> Marc Schwartz
>
>
> On Aug 30, 2010, at 8:36 AM, Henrique Dallazuanna wrote:
>
>> Try this:
>>
>> as.data.frame(by( indf, indf$charid, function(x) c(m=mean(x), s=sd(x)) ))
>>
>> On Mon, Aug 30, 2010 at 10:19 AM, ivo welch <ivo.we...@gmail.com> wrote:
>>
>>> dear R experts:
>>>
>>> has someone written a function that returns the results of by() as a
>>> data frame? of course, this can work only if the output of the
>>> function that is an argument to by() is a numerical vector.
>>> presumably, what is now names(byobject) would become a column in the
>>> data frame, and the by object's list elements would become columns.
>>> it's a little bit like flattening the by() output object (so that the
>>> name of the list item and its contents become the same row), and
>>> having the right names for the columns. I don't know how to do this
>>> quickly in the R way. (Doing it slowly, e.g., with a for loop over
>>> the list of vectors, is easy, but would not make a nice function for
>>> me to use often.)
>>>
>>> for example, lets say my by() output is currently
>>>
>>>   by( indf, indf$charid, function(x) c(m=mean(x), s=sd(x)) )
>>>
>>>   $`A`
>>>   [1] 2 3
>>>   $`B`
>>>   [2] 4 5
>>>
>>> then the revised by() would instead produce
>>>
>>>   charid  m  s
>>>   A       2  3
>>>   B       4  5
>>>
>>> working with data frames is often more intuitive than working with the
>>> output of by(). the R wizards are probably chuckling now about how
>>> easy this is...
>>>
>>> regards,
>>>
>>> /iaw
>>>
>>> ----
>>> Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
>
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
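
For the original by() question, a minimal sketch of the flattening that was asked for, under stated assumptions: `indf` here is made-up data with a grouping column `charid` and a single numeric column `x`, and the summary function works on that one column rather than on the whole data frame. It stacks the per-group vectors with do.call(rbind, ...) and then moves the group names into a proper column.

```r
# Hypothetical example data; `indf`, `charid` and `x` are assumed names.
indf <- data.frame(
  charid = c("A", "A", "B", "B"),
  x      = c(1, 3, 4, 6)
)

# by() returns one named numeric vector per group.
pieces <- by(indf, indf$charid, function(d) c(m = mean(d$x), s = sd(d$x)))

# Stack the per-group vectors into a matrix; the group names become row names.
mat <- do.call(rbind, pieces)

# Turn the row names into a regular column and drop them as row names.
outdf <- data.frame(charid = rownames(mat), mat, row.names = NULL)
outdf
```

This only works when every group returns a vector of the same length with the same names, which matches the condition stated in the question.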