This solution also seems to be the fastest of the proposed options for
this data set:
library("rbenchmark")
benchmark(columns = c("test", "elapsed", "relative"), order = "elapsed",
    apply = apply(iris[, -5], 2, tapply, iris$Species, mean),
    with = with(iris, rowsum(iris[, -5], Species) / table(Species)),
    aggregate = aggregate(iris[, -5], list(iris[, 5]), mean),
    sapply = sapply(split(iris[, 1:4], iris$Species), mean))
#        test elapsed relative
# 4    sapply   0.148 1.000000
# 1     apply   0.248 1.675676
# 2      with   0.310 2.094595
# 3 aggregate   0.313 2.114865
However, the 'with/rowsum/table' option proposed by Bill Venables
appears to scale better:
i <- rbind(iris, iris, iris, iris, iris)
i <- rbind(i, i, i, i, i)
i <- rbind(i, i, i, i, i)
i <- rbind(i, i, i, i, i)
NROW(i)
# [1] 93750
benchmark(columns = c("test", "elapsed", "relative"), order = "elapsed",
    apply = apply(i[, -5], 2, tapply, i$Species, mean),
    with = with(i, rowsum(i[, -5], Species) / table(Species)),
    aggregate = aggregate(i[, -5], list(i[, 5]), mean),
    sapply = sapply(split(i[, 1:4], i$Species), mean))
#        test elapsed  relative
# 2      with   2.708  1.000000
# 4    sapply   5.189  1.916174
# 3 aggregate  15.990  5.904727
# 1     apply  31.646 11.686115
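For anyone wondering what the 'with/rowsum/table' option actually computes, here is a minimal sketch of the same idea split into steps. The variable names (group_sums, group_sizes, group_means) are mine, and I convert to a matrix and a plain vector first, which the one-liner above does not do, so treat this as an illustration rather than the exact code that was timed.

```r
# Step 1: per-species column sums (one row per species, one column per variable).
group_sums <- rowsum(as.matrix(iris[, -5]), iris$Species)

# Step 2: number of observations in each species, in factor-level order.
group_sizes <- as.vector(table(iris$Species))

# Step 3: divide each row of sums by its group size.
# The vector is recycled down the columns, so row k is divided by group_sizes[k].
group_means <- group_sums / group_sizes

print(round(group_means, 3))
```

The grouping is handled by a single call to rowsum() and a single call to table() rather than by one call to mean() per group and column, which may be part of why it holds up on the larger data set, though I have not checked that.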
(Because I care about these things...)
Allan
On 10/06/10 09:44, Petr PIKAL wrote:
Hi
Besides the other options, split/sapply can also be used:
sapply(split(iris[,1:4], iris$Species), mean)
Regards
Petr
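A note for readers on current versions of R: as far as I know, mean() no longer accepts a data frame (since R 3.0.0), so the call above returns NA with a warning there. A minimal sketch of the same split/sapply idea with colMeans() swapped in for mean() follows. The name species_means and the t() call, which puts species in rows as in the requested table, are my additions.

```r
# Split the four numeric columns by species, then take the column means
# of each piece. colMeans() is used because mean() on a data frame is
# not supported in current R.
species_means <- t(sapply(split(iris[, 1:4], iris$Species), colMeans))

# One row per species, one column per measurement.
print(round(species_means, 3))
```

Without t() the result has the measurements in rows and the species in columns, which matches the layout of the original sapply call.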
r-help-boun...@r-project.org wrote on 10.06.2010 00:43:29:
Hi there:
I have a question about computing mean values for a data.frame. Take the
iris data as an example. Suppose I have a data.frame that looks like the
following:
---------------------
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
.            .           .            .           .       .
.            .           .            .           .       .
.            .           .            .           .       .
-----------------------
There are three different species in this table. I want to make a table
that gives the mean value for each species, like the following:
-----------------
                Sepal.Length Sepal.Width Petal.Length Petal.Width
mean.setosa            5.006       3.428        1.462       0.246
mean.versicolor        5.936       2.770        4.260       1.326
mean.virginica         6.588       2.974        5.552       2.026
-----------------
Is there a shorter syntax that can do this? I mean something shorter than
the code I wrote below:
attach(iris)
mean.setosa<-mean(iris[Species=="setosa", 1:4])
mean.versicolor<-mean(iris[Species=="versicolor", 1:4])
mean.virginica<-mean(iris[Species=="virginica", 1:4])
data.mean<-rbind(mean.setosa, mean.versicolor, mean.virginica)
detach(iris)
------------------
Thanks a million!!!
--
=====================================
Shih-Hsiung, Chou
System Administrator / PH.D Student at
Department of Industrial Manufacturing
and Systems Engineering
Kansas State University
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.