This solution also seems to be the fastest of the proposed options for this data set:

library("rbenchmark")
benchmark(columns = c("test", "elapsed", "relative"), order = "elapsed",
          apply = apply(iris[, -5], 2, tapply, iris$Species, mean),
          with = with(iris, rowsum(iris[, -5], Species) / table(Species)),
          aggregate = aggregate(iris[, -5], list(iris[, 5]), mean),
          sapply = sapply(split(iris[, 1:4], iris$Species), mean))
#        test elapsed relative
# 4    sapply   0.148 1.000000
# 1     apply   0.248 1.675676
# 2      with   0.310 2.094595
# 3 aggregate   0.313 2.114865

However, the 'with/rowsum/table' option proposed by Bill Venables appears to scale better:

i <- rbind(iris, iris, iris, iris, iris)
i <- rbind(i, i, i, i, i); i <- rbind(i, i, i, i, i); i <- rbind(i, i, i, i, i)
NROW(i)
# [1] 93750
benchmark(columns=c("test", "elapsed", "relative"), order="elapsed",
          apply=apply(i[, -5], 2, tapply, i$Species, mean),
          with=with(i, rowsum(i[, -5], Species)/table(Species)),
          aggregate=aggregate(i[,-5],list(i[,5]),mean),
          sapply=sapply(split(i[,1:4], i$Species), mean))
#        test elapsed  relative
# 2      with   2.708  1.000000
# 4    sapply   5.189  1.916174
# 3 aggregate  15.990  5.904727
# 1     apply  31.646 11.686115
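
A note for anyone re-running this on a current R: as far as I know, from R 3.0.0 onwards mean() no longer works column-wise on a data frame (it returns NA with a warning), so the sapply/split call would need colMeans() instead. The following is only a sketch of that substitution, checked against the rowsum/table approach. The names by_split, counts and by_rowsum are made up for illustration, and the timings above were not re-run with this variant.

```r
# Per-species column means without calling mean() on a data frame.
# sapply() returns variables in rows and species in columns, so transpose.
by_split <- t(sapply(split(iris[, 1:4], iris$Species), colMeans))

# The rowsum/table idea, written out with an explicit count vector.
# Dividing a 3 x 4 matrix by a length-3 vector recycles down the columns,
# so each row (species) is divided by its own group size.
counts <- as.vector(table(iris$Species))
by_rowsum <- as.matrix(rowsum(iris[, 1:4], iris$Species)) / counts

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(by_split, by_rowsum, check.attributes = FALSE)))
by_split
```

The result is a 3 x 4 matrix with one row per species, which is the layout asked for in the original question.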

(Because I care about these things...)

Allan

On 10/06/10 09:44, Petr PIKAL wrote:
Hi

split/sapply can also be used, besides the other options:

sapply(split(iris[,1:4], iris$Species), mean)

Regards
Petr

r-help-boun...@r-project.org wrote on 10.06.2010 00:43:29:

Hi there:
      I have a question about generating the mean values of a data.frame.
Take the iris data for example: I have a data.frame that looks like the
following:
---------------------
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
.            .           .            .           .       .
.            .           .            .           .       .
.            .           .            .           .       .
-----------------------
There are three different species in this table. I want to make a table
and calculate the mean value for each species, as in the following table:

-----------------
                Sepal.Length Sepal.Width Petal.Length Petal.Width
mean.setosa            5.006       3.428        1.462       0.246
mean.versicolor        5.936       2.770        4.260       1.326
mean.virginica         6.588       2.974        5.552       2.026
-----------------
Is there any shorter syntax that can do this? I mean shorter than the
code I wrote, as follows:

attach(iris)
mean.setosa<-mean(iris[Species=="setosa", 1:4])
mean.versicolor<-mean(iris[Species=="versicolor", 1:4])
mean.virginica<-mean(iris[Species=="virginica", 1:4])
data.mean<-rbind(mean.setosa, mean.versicolor, mean.virginica)
detach(iris)
------------------

Thanks a million!!!


--
=====================================
Shih-Hsiung, Chou
System Administrator / PH.D Student at
Department of Industrial Manufacturing
and Systems Engineering
Kansas State University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.