On Wed, 2007-12-12 at 02:05 -0800, Mark Difford wrote: > Dear List-members, > > Hopefully someone will help through my confusion: > > In order to get the same coefficients as we get from the following > > ## > require (MASS) > summary ( lm(Gas ~ Insul/Temp - 1, data = whiteside) ) > > ...................... > > we need to do the following (if we use model.matrix to specify the model) > > ## > summary ( lm(Gas ~ model.matrix(~ Insul/Temp - 1) - 1, data = whiteside) ) > > ...................... > > That is, we need to take out "two intercepts." Is this "correct"?
Yes - if you insist on doing things that way > > Shouldn't lm check to see if an intercept has been requested as part of the > model formula? It does, (i.e. whether the user specified -1 in the formula argument), but you specified it inside model.matrix() so the formula parsing code never sees it. One wouldn't want lm to need to know about all functions that have/could ever be written in R, past or future, to know how to divine that here you wanted "-1" to mean "remove intercept please Mr. R Parser" and not "subtract 1 from this result please Mr. R Parser". You are really abusing the reason for having the formula interface. The whole point of it is to make it easy for user to specify a model, from which R generates the model matrix for you. Why use the formula at all if you have already produced your model matrix? See ?lm.fit for a fast way of fitting linear models (used within lm() ) if you have all the bits in place (i.e. you already have a model matrix) and are happy to take care of other details yourself. <snip /> > (Sorry, this is getting really long) --- So, my question/confusion comes > down to wanting to know why lm() doesn't check to see if an intercept has > been specified when the model has been specified using model.matrix. I guess because you can't programme around every possible usage that a user might dream up to try. If it did, lm would be an absolute pig and run like one. It is slow enough as it is with all the user-friendly stuff in there to help. (By slow enough, I mean it is perfectly speedy in routine use, but you wouldn't want to use it in a permutation test-like environment if you were after speed - you'd be better off working with lm.fit directly) The model.matrix bit is returning a matrix (without an entry for an intercept), which is fine on the rhs of a formula. As all you've done here is (effectively) create the following model: > form <- formula(paste("Gas ~", + paste(colnames(model.matrix (~ Insul/Temp-1, + data=whiteside)), collapse = " + "))) > form Gas ~ InsulBefore + InsulAfter + InsulBefore:Temp + InsulAfter:Temp I.e., that is what your convoluted formula is being interpreted as, you then still need to remove the intercept that R will automagically add to the model when it subsequently creates the model matrix. HTH G > > Regards, > Mark. > -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.