Dear R users:

Every textbook reference I have consulted says that in a nested ANOVA (e.g., A/B), the F statistic for factor A should be calculated as

F_A = MS_A / MS_(B within A).

But when I run this simple example:

set.seed(1)
A <- factor(rep(1:3, each=4))
B <- factor(rep(1:2, 3, each=2))
Y <- rnorm(12)
anova(lm(Y ~ A/B))

I get this result:

  Analysis of Variance Table

  Response: Y
            Df Sum Sq Mean Sq F value Pr(>F)
  A          2 0.4735 0.23675  0.2845 0.7620
  A:B        3 1.7635 0.58783  0.7064 0.5823
  Residuals  6 4.9931 0.83218

Evidently, R calculates the F value for A as MS_A / MS_Residuals.
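
As a quick check of that reading, both candidate ratios can be recomputed from the mean squares printed above. This is only arithmetic on the rounded table values, and the variable names are my own:

```r
# Mean squares and degrees of freedom copied from the anova table above
ms_A   <- 0.23675   # A,         Df = 2
ms_AB  <- 0.58783   # A:B,       Df = 3
ms_res <- 0.83218   # Residuals, Df = 6

# Ratio that matches the F value R prints for A
F_reported <- ms_A / ms_res
# Ratio the textbooks prescribe for a nested design
F_nested <- ms_A / ms_AB

# Upper-tail p-value for the nested ratio, on 2 and 3 degrees of freedom
p_nested <- pf(F_nested, 2, 3, lower.tail = FALSE)

print(round(c(F_reported = F_reported, F_nested = F_nested,
              p_nested = p_nested), 4))
```

The first ratio reproduces the 0.2845 in the table; the second is the value I was expecting to see.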

While it is straightforward enough to calculate what I think is the correct result from the table, I am surprised that R doesn't give me that answer directly. Does anybody know if R's behavior is intentional, and if so, why? Equally importantly, is there a straightforward way to make R give the answer I expect, that is:

     Df Sum Sq Mean Sq F value Pr(>F)
  A   2 0.4735 0.23675  0.4028 0.6999

The students in my statistics class would be much happier if they didn't have to type things like

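  a <- anova(...)
  # F for A is the ratio of mean squares (not sums of squares): MS_A / MS_(A:B)
  F <- a$`Mean Sq`[1] / a$`Mean Sq`[2]
  # Upper-tail probability on Df_A and Df_(A:B) degrees of freedom
  P <- pf(F, a$Df[1], a$Df[2], lower.tail = FALSE)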

(They are not R programmers (yet).) And to be honest, I would find it easier to read those results directly from the table as well.
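
To make concrete what I am after, here is a minimal sketch of a wrapper that could be handed to students. The name nested_F is one I made up (it is not an existing R function), and it assumes that the nesting term is the second row of the anova table, as it is for Y ~ A/B:

```r
# Hypothetical helper, not part of R: test one term of an anova table
# against another term instead of against the residuals.
nested_F <- function(tab, num = 1, den = 2) {
  Fval <- tab$`Mean Sq`[num] / tab$`Mean Sq`[den]
  Pval <- pf(Fval, tab$Df[num], tab$Df[den], lower.tail = FALSE)
  data.frame(Df        = tab$Df[num],
             `Sum Sq`  = tab$`Sum Sq`[num],
             `Mean Sq` = tab$`Mean Sq`[num],
             `F value` = Fval,
             `Pr(>F)`  = Pval,
             row.names = rownames(tab)[num],
             check.names = FALSE)
}

# Same example as above
set.seed(1)
A <- factor(rep(1:3, each = 4))
B <- factor(rep(1:2, 3, each = 2))
Y <- rnorm(12)

tab <- anova(lm(Y ~ A/B))
res <- nested_F(tab)
print(res)
```

With that, the students would only need to type nested_F(anova(lm(Y ~ A/B))), but I would still prefer a built-in route if one exists.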

Thanks,

Daniel Wagenaar

--
Daniel A. Wagenaar, PhD
Assistant Professor
Department of Biological Sciences
McMicken College of Arts and Sciences
University of Cincinnati
Cincinnati, OH 45221
Phone: +1 (513) 556-9757
Email: daniel.wagen...@uc.edu
Web: http://www.danielwagenaar.net
