Berwin A Turlach, Friday, 19 October 2007:

> G'day Ralf,

Hi Berwin,
> On Fri, 19 Oct 2007 09:51:37 +0200 Ralf Goertz <[EMAIL PROTECTED]>
> wrote:
>
> > Why should either of those formulas yield the output of
> > summary(lm(y~x+0)) ?
>
> The R-squared output of that command is documented in
> help(summary.lm):
>
>   r.squared: R^2, the 'fraction of variance explained by the model',
>
>       R^2 = 1 - Sum(R[i]^2) / Sum((y[i] - y*)^2),

Yes, I know. But you know why I chose those formulas, right?

> where y* is the mean of y[i] if there is an intercept and zero
> otherwise.
>
> And, indeed:
>
> > 1-sum(residuals(lm(y~x+0))^2)/sum((y-0)^2)
> [1] 0.9796238
>
> confirms this.
>
> Note: if you do not have an intercept in your model, the residuals
> do not have to add to zero; and, typically, they will not. Hence,
> var(residuals(lm(y~x+0))) does not give you the residual sum of
> squares.

Yes, you are right, I know why.

> > In order to save the role of R^2 as a goodness-of-fit indicator
>
> R^2 is not a goodness-of-fit indicator, neither in models with an
> intercept nor in models without one. So I do not see how you can
> save its role as a goodness-of-fit indicator. :)

Okay, I surrender.

> Since you are posting from a .de domain, I assume you will
> understand the following quote from Tutz (2000), "Die Analyse
> kategorialer Daten", page 18:
>
> R^2 does *not* measure the goodness of fit of the linear model; it
> says nothing about whether the linear specification is true or
> false, but only about whether individual observations can be
> predicted from the linear specification. R^2 is essentially
> determined by the design, i.e. by the values that x takes
> (cf. Kockelkorn (1998)).

Thank you very much.

> > But I assume that this has probably been discussed at length
> > somewhere more appropriate than r-help.
>
> I am sure about that, but it was also discussed here on r-help (long
> ago). The problem is that this compares two models that are not
> nested in each other, which is quite a controversial thing to do;
> some might even go so far as to say that it makes no sense at all.
> The other problem with this approach is illustrated by my example:
>
> > set.seed(20070807)
> > x <- runif(100)*2+10
> > y <- 4+rnorm(x, sd=1)
> > 1-var(residuals(lm(y~x+0)))/var(y)
> [1] -0.04848273
>
> How do you explain that a quantity that is called R-squared,
> implying that it is the square of something, hence always
> non-negative, can become negative?

Because the correlation coefficient is either 0.2201879424i or
-0.2201879424i ;)

Thanks for your time, and yours as well, Steve. You've both been very
helpful.

Ralf
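P.S. For the archives, here is a minimal self-contained sketch in base
R that ties the pieces of this thread together (the object name fit0
is just mine, for illustration). It shows that the residuals of a
no-intercept fit need not sum to zero, that summary() computes R^2
against y* = 0 exactly as help(summary.lm) says, and that using var()
instead centres both the residuals and y at their means, which can
push the ratio past 1 and turn the "square" negative:

    set.seed(20070807)
    x <- runif(100) * 2 + 10
    y <- 4 + rnorm(x, sd = 1)      # y does not depend on x at all

    fit0 <- lm(y ~ x + 0)          # regression through the origin

    ## without an intercept, nothing absorbs the mean of the residuals
    sum(residuals(fit0))           # not (close to) zero

    ## summary() uses y* = 0, as documented
    summary(fit0)$r.squared
    1 - sum(residuals(fit0)^2) / sum((y - 0)^2)   # same value, 0.9796238

    ## var() centres residuals and y at their respective means; the
    ## variance ratio can then exceed 1 and the "square" goes negative
    1 - var(residuals(fit0)) / var(y)             # -0.04848273

    ## whence the imaginary "correlation coefficient" in my joke above
    sqrt(as.complex(1 - var(residuals(fit0)) / var(y)))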