Re: [R] aggregate.formula implicitly removes rows containing NA

Peter Ehlers Tue, 11 Jan 2011 17:15:23 -0800

On 2011-01-11 14:41, Dickison, Daniel wrote:

The documentation for `aggregate` makes it sound like aggregate.formula should behave 
identically to aggregate.data.frame (apart from the way the parameters are passed).  But 
it looks like aggregate.formula is quietly removing rows where any of the 
"output" variables (those on the LHS of the formula) are NA.  This differs from 
how aggregate.data.frame works.  Is this expected behavior?


Here are a couple of examples:

d <- data.frame(a=rep(1:2, each=2),
                b=c(1,2,NA,3))

aggregate(d["b"], d["a"], mean)

   a   b
1 1 1.5
2 2  NA

aggregate(b ~ a, d, mean)

   a   b
1 1 1.5
2 2 3.0

It removes a whole row even if only one of the columns in that row is NA, e.g.:

d <- data.frame(a=rep(1:2, each=2),
                b=c(1,2,NA,3),
                c=c(NA,2,3,NA))

aggregate(cbind(b,c) ~ a, d, mean)

   a b c
1 1 2 2

Daniel


Try setting na.action = na.pass.

Peter Ehlers
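
For readers of the archive, here is a minimal sketch of that suggestion, applied to the second example from the question. It relies on the formula method's documented default of na.action = na.omit, which drops any row with an NA in a variable used in the formula. The result names (res_default, res_pass, res_narm) are made up for illustration, and the last call, which passes na.rm = TRUE through to mean(), is an extra option that was not part of the reply.

```r
d <- data.frame(a = rep(1:2, each = 2),
                b = c(1, 2, NA, 3),
                c = c(NA, 2, 3, NA))

# Default (na.action = na.omit): only the one complete row survives,
# so the result has a single group.
res_default <- aggregate(cbind(b, c) ~ a, data = d, FUN = mean)

# na.pass keeps every row; mean() then sees the NAs and returns NA
# for any group/column combination that contains one.
res_pass <- aggregate(cbind(b, c) ~ a, data = d, FUN = mean,
                      na.action = na.pass)

# Keep every row, but let mean() drop NAs column by column.
# na.rm = TRUE is forwarded to FUN through the '...' argument.
res_narm <- aggregate(cbind(b, c) ~ a, data = d, FUN = mean,
                      na.action = na.pass, na.rm = TRUE)

print(res_default)
print(res_pass)
print(res_narm)
```

With na.action = na.pass the formula method should agree with aggregate(d[c("b", "c")], d["a"], mean). Adding na.rm = TRUE gives per-column means over the non-missing values instead.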

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
