Would someone be so kind as to explain in plain English what the ANOVA code 
(anova.lm) is doing? I am having a hard time reconciling the brute-force 
regression calculation shown in textbooks with the formula-based algorithm in 
R. Specifically, I see:

    p <- object$rank
    if (p > 0L) {
        p1 <- 1L:p
        comp <- object$effects[p1]
        asgn <- object$assign[object$qr$pivot][p1]
        nmeffects <- c("(Intercept)", attr(object$terms, "term.labels"))
        tlabels <- nmeffects[1 + unique(asgn)]
        ss <- c(unlist(lapply(split(comp^2, asgn), sum)), ssr)
        df <- c(unlist(lapply(split(asgn, asgn), length)), dfr)
    }
    else {
        ss <- ssr
        df <- dfr
        tlabels <- character(0L)
    }
    ms <- ss/df
    f <- ms/(ssr/dfr)
    P <- pf(f, df, dfr, lower.tail = FALSE)
 

I think I understand the check for 'p' being non-zero. 'p' is the rank of the 
fit, which is essentially the number of columns in the model matrix (including 
the intercept column if it exists). So in a regression that includes the 
intercept and one term (like dist ~ speed) you would have a model matrix with 
a column of '1's and then a column of data. The 'assign' vector would contain 
[0,1]. Then, to find the degrees of freedom, the code splits the assign vector 
by itself. I am having a hard time seeing how this ever produces degrees of 
freedom that differ from one term to the next. As I read it, the vector 'df' 
would always be something like [2,2,dfr], but that is obviously wrong. Would 
someone care to enlighten me on what the code above is doing?
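
To make the question concrete, here is a small sketch for inspecting those 
intermediate values on the built-in 'cars' data. The name 'fit' is my own. 
The excerpt above does not show where 'ssr' and 'dfr' are set, so I am 
assuming they are the residual sum of squares and the residual degrees of 
freedom of an unweighted fit.

```r
# Fit the example model on the built-in 'cars' data set.
fit <- lm(dist ~ speed, data = cars)

# Assumption: 'ssr' and 'dfr' are the residual sum of squares and the
# residual degrees of freedom (unweighted fit); the excerpt does not show this.
ssr <- sum(residuals(fit)^2)
dfr <- df.residual(fit)

# Same steps as the excerpt, with 'object' replaced by 'fit'.
p    <- fit$rank
p1   <- 1L:p
comp <- fit$effects[p1]
asgn <- fit$assign[fit$qr$pivot][p1]

# Inspect the pieces that feed 'ss' and 'df'.
print(asgn)
print(split(asgn, asgn))
ss <- c(unlist(lapply(split(comp^2, asgn), sum)), ssr)
df <- c(unlist(lapply(split(asgn, asgn), length)), dfr)
print(ss)
print(df)

# For comparison, the table that anova() itself reports.
print(anova(fit))
```

My hope is that comparing the printed 'df' with the Df column from anova(fit) 
shows where my reading of the code goes wrong.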

Thank you.

Kevin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
