AFAIK, tapply() only works on one variable at a time (apart from the
grouping variable). It might be better to use split() here:
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg   = c(2, 2, 2, 4, 4, 1, 1, 1, 1))
df.s <- split(df, df$ID)   # one data frame per ID
out <- sapply(df.s, function(m) {
    c(mu = mean(m$value), var = var(m$value),
      min = min(m$Seg), max = max(m$Seg))
})
out <- t(out)              # one row per ID, one column per statistic
out
          mu      var min max
111 4.333333 4.333333   2   2
138 6.000000 4.666667   1   1
178 5.000000 8.000000   4   4
You could also have used range() here instead of calculating min and max
separately but naming the resulting columns becomes a bit tricky.
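One possible way around the naming issue, as a rough sketch only: wrap range() in setNames() so the two values carry their own labels into c(). The data frame is the same toy data as above; setNames() is not part of the original suggestion.

```r
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg   = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

out <- t(sapply(split(df, df$ID), function(m) {
    # range() returns c(min, max); setNames() labels the two elements
    c(mu = mean(m$value), var = var(m$value),
      setNames(range(m$Seg), c("min", "max")))
}))
out
```

This should give the same four columns (mu, var, min, max) as computing min() and max() separately.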
Regards, Adai
PS: If you do a dput() on a subset of the data, you can get a simple
reproducible example that other R users can easily read in.
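As a small illustration of the dput() idea (a sketch on the toy data from this thread, with the subset size of 3 rows chosen arbitrarily): dput() prints an R expression that rebuilds the object, and that text can be pasted straight into a message.

```r
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg   = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

sub <- head(df, 3)   # a small subset is enough for a posting
dput(sub)            # prints a structure(...) call that recreates `sub`

# Round trip: capture the printed text and evaluate it, as a reader would
# after copying it from the e-mail.
txt     <- paste(capture.output(dput(sub)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
```

Here `rebuilt` should be the same data frame as `sub`, which is what makes the example reproducible for other readers.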
Julia Liu wrote:
Adai,
Thank you so much for your help. I like your code the best. :) So simple. I have another question though, if you don't mind. I'd like to include another variable in "res". This variable defines the segmentation of each person (ranges, say, from 1 to 4).
ID value Seg
111 5 2
111 6 2
111 2 2
178 7 4
178 3 4
138 3 1
138 8 1
138 7 1
138 6 1
How to do this? Thank you so much for the help.
Sincerely
Julia
--- On Thu, 9/11/08, Adaikalavan Ramasamy <[EMAIL PROTECTED]> wrote:
From: Adaikalavan Ramasamy <[EMAIL PROTECTED]>
Subject: Re: [R] Calculate mean/var by ID
To: "Jorge Ivan Velez" <[EMAIL PROTECTED]>
Cc: "liujb" <[EMAIL PROTECTED]>, r-help@r-project.org
Date: Thursday, September 11, 2008, 10:28 PM
A slight variation of what Jorge has proposed is:
f <- function(x) c( mu=mean(x), var=var(x) )
do.call( "rbind", tapply( df$value, df$ID, f ) )
          mu      var
111 4.333333 4.333333
138 6.000000 4.666667
178 5.000000 8.000000
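To see why the do.call("rbind", ...) step is needed, here is a short sketch on the same data: when f returns more than one number, tapply() cannot simplify to a plain vector and instead returns a list-like array with one element per ID. The intermediate name `pieces` is only for illustration.

```r
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6))

f <- function(x) c(mu = mean(x), var = var(x))

pieces <- tapply(df$value, df$ID, f)  # one named vector per ID
pieces[[1]]                           # the mu/var pair for the first ID

res <- do.call("rbind", pieces)       # stack the pieces into a matrix
res
```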
Regards, Adai
Jorge Ivan Velez wrote:
Dear Julia,
Try also
x <- read.table(textConnection("ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6"), header = TRUE)
closeAllConnections()
# with() evaluates inside x without attach(), which would leave x on
# the search path
do.call(rbind, with(x, tapply(value, ID, function(v) {
    res <- c(mean(v, na.rm = TRUE), var(v, na.rm = TRUE))
    names(res) <- c("Mean", "Variance")
    res
})))
HTH,
Jorge
On Thu, Sep 11, 2008 at 1:45 PM, liujb <[EMAIL PROTECTED]> wrote:
Hello,
I have a data set that looks like this.
ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6
.
.
.
I'd like to calculate the mean and var for each object identified by
the ID. I can in theory just loop through the whole thing, but is there
an easier way/command that lets me calculate the mean/var by ID?
Thanks,
Julia
--
View this message in context:
http://www.nabble.com/Calculate-mean-var-by-ID-tp19440461p19440461.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.