Alexandra <alku <at> imm.dtu.dk> writes: > > Dear all, > > I have encountered some strange things when creating lm objects in R: model > depends on the order of the terms specified in a formula. > Let us consider the following simple example: > > > dat <- expand.grid(A = factor(c("a1", "a2")), > + B = factor(paste("b", 1:4, sep="")), > + rep = factor(1:2)) > > set.seed(12345) > > dat$x <- 1:16 * .5 + runif(16) > > dat$Y <- rnorm(16) > > > dat
[snip] > > > logLik(m0 <- lm(Y ~ A:B + x:A, dat)) > 'log Lik.' -13.22186 (df=11) > > > logLik(m1 <- lm(Y ~ x:A + A:B, dat)) > 'log Lik.' -13.66822 (df=10) > > To me it is a bit strange that m0 and m1 models appear to have different > loglikelihood (only the order of the terms in a formula was changed!) > > My guess is that the problem lies in model.matrix: > > X1 <- model.matrix(~ x:A + A:B, dat) > X2 <- model.matrix(~ A:B + x:A, dat) > > ## number of columns: > ncol(X1) ## 9 > > > ncol(X2) ## 11 > > ## rank of design matrices: > qr(X1)$rank ## 9 > > qr(X2)$rank ## 10 I agree that this seems weird, and I haven't been able to figure it out yet. My best (not very well-informed) guess is that there is something going on with automatic dropping of terms that appear to be aliased?? and that this test is (perhaps unintentionally) order- dependent. I don't see anything obvious in the R code (model.matrix.default) that would be responsible, and I haven't had time (and probably won't) to go through the underlying C code (do_modelmatrix in src/main/model.c) to see what's going on. For what it's worth, the ultimate reference for how this stuff is *supposed* to work is (I think) the "White Book" (Statistical Models in S, Chambers and Hastie) -- I don't alas have a copy. I don't see any further references to model.matrix or 'model matrix' in the R manuals, other than a fairly short description in section 11 of the "Intro to R". In other words, I'm stumped too and I hope someone else can provide a more informed answer. Ben Bolker ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.