AFAIK, tapply() only works on one variable at a time (apart from the grouping variable). It might be better to use split() here:

   df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                    value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                    Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1) )

   df.s <- split( df, df$ID )

   out <- sapply( df.s, function(m){
                    c( mu=mean(m$value), var=var(m$value),
                       min=min(m$Seg), max=max(m$Seg) ) })
   out <- t(out)
   out
             mu      var min max
   111 4.333333 4.333333   2   2
   138 6.000000 4.666667   1   1
   178 5.000000 8.000000   4   4

You could also have used range() here instead of calculating min and max separately, but naming the resulting columns becomes a bit tricky.
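For what it's worth, here is a minimal sketch of the range() variant. It assumes setNames() from the stats package is an acceptable way to label the two values that range() returns; the name out2 is only illustrative.

```r
df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

out2 <- t(sapply(split(df, df$ID), function(m) {
  # range() returns an unnamed c(min, max), so label it before combining
  c(mu = mean(m$value), var = var(m$value),
    setNames(range(m$Seg), c("min", "max")))
}))
out2
```

The columns come out as mu, var, min and max, as in the version that calls min() and max() separately.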

Regards, Adai

PS: If you do a dput() on a subset of the data, you can get a simple reproducible example that other R users can easily read in.
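As a rough illustration of the dput() suggestion: the 4-row subset and the names txt and df2 are arbitrary choices for this sketch, and capture.output() only stands in for copying the printed text into an email.

```r
df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

# Prints a structure(...) call for the first 4 rows; paste that text into the email
dput(head(df, 4))

# A reader recreates the subset by evaluating the pasted text.
# Here the copy/paste step is simulated by capturing what dput() printed.
txt <- paste(capture.output(dput(head(df, 4))), collapse = "\n")
df2 <- eval(parse(text = txt))
```

Assigning the pasted structure(...) text to a name has the same effect as the eval(parse()) step above.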



Julia Liu wrote:
Adai,

Thank you so much for your help. I like your code the best. :) So simple. I have another question though, if you don't mind. I'd like to include another variable in "res". This variable defines the segmentation of each person (ranges, say, from 1 to 4).

ID    value  Seg
111     5      2
111     6      2
111     2      2
178     7      4
178     3      4
138     3      1
138     8      1
138     7      1
138     6      1

How to do this? Thank you so much for the help.
Sincerely
Julia

--- On Thu, 9/11/08, Adaikalavan Ramasamy <[EMAIL PROTECTED]> wrote:
From: Adaikalavan Ramasamy <[EMAIL PROTECTED]>
Subject: Re: [R] Calculate mean/var by ID
To: "Jorge Ivan Velez" <[EMAIL PROTECTED]>
Cc: "liujb" <[EMAIL PROTECTED]>, r-help@r-project.org
Date: Thursday, September 11, 2008, 10:28 PM

A slight variation of what Jorge has proposed is:

    f <- function(x) c( mu=mean(x), var=var(x) )

    do.call( "rbind", tapply( df$value, df$ID, f ) )

             mu      var
   111 4.333333 4.333333
   138 6.000000 4.666667
   178 5.000000 8.000000

Regards, Adai



Jorge Ivan Velez wrote:
Dear Julia,
Try also

x=read.table(textConnection("ID    value
111     5
111     6
111     2
178     7
178     3
138     3
138     8
138     7
138     6"),header=TRUE)
closeAllConnections()
attach(x)

do.call(rbind, tapply(value, ID, function(x){
    res = c(mean(x, na.rm=TRUE), var(x, na.rm=TRUE))
    names(res) = c('Mean', 'Variance')
    res
}))

HTH,

Jorge




On Thu, Sep 11, 2008 at 1:45 PM, liujb <[EMAIL PROTECTED]> wrote:

Hello,

I have a data set that looks like this.
ID    value
111     5
111     6
111     2
178     7
178     3
138     3
138     8
138     7
138     6
.
.
.

I'd like to calculate the mean and var for each object identified by the ID.
I can in theory just loop through the whole thing..., but is there an easier
way/command which lets me calculate the mean/var by ID?

Thanks,
Julia
--
View this message in context:

http://www.nabble.com/Calculate-mean-var-by-ID-tp19440461p19440461.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
