Re: [R] data.frame and formula classes of aggregate

David Winsemius Mon, 29 Nov 2010 06:50:49 -0800


On Nov 29, 2010, at 9:35 AM, David Freedman wrote:

Hi - I apologize for the 2nd post, but I think my question from a few weeks
ago may have been overlooked on a Friday afternoon.
I might be missing something very obvious, but is it widely known that the
aggregate function handles missing values differently depending on whether a
data frame or a formula is the first argument?

I'm not sure if it is widely known, but it is certainly suggested by the documentation for aggregate, since aggregate.data.frame has different defaults than aggregate.formula. See the Usage section at the very top of ?aggregate.

 For example,

(d <- data.frame(sex = rep(0:1, each = 3),
                 wt = c(100, 110, 120, 200, 210, NA),
                 ht = c(10, 20, NA, 30, 40, 50)))
x1 <- aggregate(d, by = list(d$sex), FUN = mean)
names(x1)[3:4] <- c('mean.dfcl.wt', 'mean.dfcl.ht')
x2 <- aggregate(cbind(wt, ht) ~ sex, FUN = mean, data = d)
names(x2)[2:3] <- c('mean.formcl.wt', 'mean.formcl.ht')
cbind(x1, x2)[, c(2, 3, 6, 4, 7)]
The output from the data.frame class has an NA if there are missing values
in the group for the variable with missing values. But the formula class
output seems to delete the entire row (missing and non-missing values) if
there are any NAs. Wouldn't one expect that the 2 forms (data frame vs
formula) of aggregate would give the same result?

thanks very much
david freedman, atlanta
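
To make the difference in defaults concrete, here is a minimal sketch using the same data as the example above. It relies on the formula method's documented default of na.action = na.omit and on the standard na.pass action; the object names (f_default, f_pass, df_default, df_narm) are only illustrative and are not from the original post.

```r
# Same data as in the question.
d <- data.frame(sex = rep(0:1, each = 3),
                wt  = c(100, 110, 120, 200, 210, NA),
                ht  = c(10, 20, NA, 30, 40, 50))

# Formula method with its default na.action = na.omit: any row containing
# an NA (rows 3 and 6 here) is dropped before the means are computed.
f_default <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean)

# Formula method with na.action = na.pass: all rows are kept, so mean()
# sees the NAs and returns NA for the affected group/variable.
f_pass <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean,
                    na.action = na.pass)

# Data frame method: there is no na.action argument, so NAs reach FUN.
df_default <- aggregate(d[c("wt", "ht")], by = list(sex = d$sex), FUN = mean)

# Passing na.rm = TRUE through to mean() drops NAs per column rather
# than per row, which is yet another result.
df_narm <- aggregate(d[c("wt", "ht")], by = list(sex = d$sex), FUN = mean,
                     na.rm = TRUE)
```

With na.action = na.pass the formula call should agree with the data frame call. Which behaviour is wanted (row-wise deletion, per-column na.rm = TRUE, or NA propagation) depends on the analysis, so it is probably safest to state the choice explicitly rather than rely on either default.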

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
