Hi Gavin, > ... In this instance, you don't want to be working with lm. You can use > lm.fit which returns an object with $coefficients, so I would guess you > need something like this (not tested) ... & more ...
That's a great help, many thank's for walking me through it; that's the stuff I need to get me through to the next level. Best Regards, Mark. Gavin Simpson wrote: > > > On Wed, 2007-12-12 at 06:46 -0800, Mark Difford wrote: >> Hi Gavin, Berwin, >> >> Thanks for your detailed replies. I'll make a general reply, if you >> don't >> mind. >> >> To reiterate, my main point is that if model.matrix() can be used in this >> way, then lm() shouldn't add an intercept. >> >> >> ... lm(Gas ~ model.matrix(~ Insul/Temp - 1), data = whiteside) .... >> >> And the documentation for lm() indicates. albeit indirectly, that >> model.matrix can be used in this way. It calls for a formula, or >> something >> that can be coerced to one. And the following meets that criterion: >> as.formula(Gas ~ model.matrix (~ Insul/Temp-1, data=whiteside)), and this >> specifies no intercept. > > This is the problem - it doesn't! > > What is says is model.matrix() don't add an intercept to the returned > model matrix. R then tries to work out what to do with the returned > matrix that is now on the rhs of the formula. But note that at no point > have you gotten round the fact that it will add an intercept *after* the > formula is parsed and a -1 is not found. This is because model.matrix is > called again, internally in lm but not with your formula as an argument > but with a terms object and nowhere in this object is it specified not > to include an intercept. > > To do this, lm would have to parse the formula for model.matrix(....) > and if it finds something work accordingly. > >> >> On the question of why I want to mess about in such a labarynthine way. >> Well my email was largely expository. With a straight call to lm(), I >> wouldn't bother with model.matrix. >> >> So, it really was about getting at lm coefficients inside a function >> (when >> you have to get the terms &c. from somewhere else), and trying to >> understand >> properly how things work, and why they work the way they do, and even if >> they should work the way they do. >> >> For instance:--- >> >> if (ols) { >> obj <- x[[1]] >> mt <- terms(obj) >> mf <- model.frame(obj) >> y <- model.response(mf) >> X <- model.matrix(mt, mf, contrasts = obj$contrasts) >> if (attr(mt, "intercept") == 1) ## This is my >> hack to overcome the double-intercept problem >> { olscf <- summary(lm(y ~ X))$coefficients } >> else { >> olscf <- summary(lm(y ~ X - 1))$coefficients >> } >> rownames(olscf) <- rownames(coef(obj)) > > In this instance, you don't want to be working with lm. You can use > lm.fit which returns an object with $coefficients, so I would guess you > need something like this (not tested) > > if (ols) { > obj <- x[[1]] > mt <- terms(obj) > mf <- model.frame(obj) > y <- model.response(mf) > X <- model.matrix(mt, mf, contrasts = obj$contrasts) > fit <- lm.fit(X, y) ## store fit in case you need something else > olscf <- coef(fit) ## get coefficients using an extractor function > ##if (attr(mt, "intercept") == 1) ## This is my hack to overcome > the double-intercept problem > ## { olscf <- summary(lm(y ~ X))$coefficients } > ##else { > ## olscf <- summary(lm(y ~ X - 1))$coefficients > ##} > ##rownames(olscf) <- rownames(coef(obj)) > } > > Note also, that even if you are using your hack, you don't need to do > > summary(lm(y ~ X))$coefficients > > as > > coef(lm(y ~ X)) > > will do. > > lm is mainly there for top-level work. If you need to do things like you > have above, use lm.fit or do the qr decomposition yourself. > > Or try a different way to build the formula you need and work with lm; > Bill Venables has a nice piece in R News Vol 2 Issue 2 in the > Programmer's Niche section on something that might be of use if you want > to build a correct formula for your needs from the colnames of X and y? > > All the best, > > G > >> >> Thanks again for your input. >> >> Regards, >> Mark. >> >> >> >> >> >> Berwin A Turlach wrote: >> > >> > G'day Mark, >> > >> > On Wed, 12 Dec 2007 02:05:54 -0800 (PST) >> > Mark Difford <[EMAIL PROTECTED]> wrote: >> > >> >> In order to get the same coefficients as we get from the following >> > [...] >> >> we need to do the following (if we use model.matrix to specify the >> >> model) >> > >> > By why would you want to do this? >> > >> >> ## >> >> summary ( lm(Gas ~ model.matrix(~ Insul/Temp - 1) - 1, data = >> >> whiteside) ) >> >> >> >> That is, we need to take out "two intercepts." Is this "correct"? >> > >> > Yes. >> > >> >> Shouldn't lm check to see if an intercept has been requested as part >> >> of the model formula? >> > >> > No, it does not. In the Details section of lm's help page you will >> > find the following: >> > >> > A formula has an implied intercept term. To remove this use >> > either 'y ~ x - 1' or 'y ~ 0 + x'. See 'formula' for more details >> > of allowed formulae. >> > >> > Thus, except if you explicitly ask for a constant term not be included, >> > lm will add a constant term (a column of ones) additionally to what >> > ever you have specified on the right hand side of the formula. >> > >> >> If I do >> >> ## >> >> summary(lm(as.formula(Gas ~ model.matrix (~ Insul/Temp-1, >> >> data=whiteside)), data=whiteside)) >> >> >> >> we get a strange model. >> > >> > Well, you get a model in which not all parameters are identifiable, and >> > a particular parameter that is not identifiable is estimated by NA. I >> > am not sure what is strange about this. >> > >> >> But the formula part of this model qualifies >> >> as a valid formula >> >> ## >> >> as.formula(Gas ~ model.matrix (~ Insul/Temp-1, data=whiteside)) >> > >> > Debatable, the above command only shows that it can be coerced into a >> > valid formula. :) >> > >> >> just as if I were to write: lm(Gas ~ Insul/Temp - 1, data=whiteside) >> >> >> >> But we know that the _correct_ formula is the following >> > >> >> ## >> >> as.formula(Gas ~ model.matrix (~ Insul/Temp-1, data=whiteside) -1) >> > >> > Why is this formula any more correct than the other one? Both specify >> > exactly the same model. It is just that one does it in an >> > overparameterised way. >> > >> >> (Sorry, this is getting really long) --- So, my question/confusion >> >> comes down to wanting to know why lm() doesn't check to see if an >> >> intercept has been specified when the model has been specified using >> >> model.matrix. >> > >> > Because lm() is documented not to check this. If you do not want to >> > have an intercept in the model you have to specifically ask it for. >> > >> > Also, comparing the output of >> > summary( lm(Gas ~ Insul/Temp - 1, data = whiteside) ) >> > and >> > summary( lm(Gas ~ Insul/Temp, data = whiteside ) ) >> > >> > you can see that lm() does not check whether there is an implicit >> > intercept in the model. Compare the (Adjusted) R-squared values >> > returned; one case is using the formula for models with no intercept >> > the other one the formula for models with intercept. Similar story >> > with the reported F-statistics. >> > >> > Cheers, >> > >> > Berwin >> > >> > =========================== Full address ============================= >> > Berwin A Turlach Tel.: +65 6515 4416 (secr) >> > Dept of Statistics and Applied Probability +65 6515 6650 (self) >> > Faculty of Science FAX : +65 6872 3919 >> > National University of Singapore >> > 6 Science Drive 2, Blk S16, Level 7 e-mail: [EMAIL PROTECTED] >> > Singapore 117546 http://www.stat.nus.edu.sg/~statba >> > >> > ______________________________________________ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> > >> > -- > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- View this message in context: http://www.nabble.com/lm-model.matrix-confusion-%28--bug%29-tp14292188p14300313.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.