Berwin A Turlach, Thursday, 18 October 2007:

> G'day all,
>
> I must admit that I have not read the previous e-mails in this thread,
> but why should that stop me from commenting? ;-)
Your comments are very welcome.

> On Thu, 18 Oct 2007 16:17:38 +0200
> Ralf Goertz <[EMAIL PROTECTED]> wrote:
>
> > But in that case the numerator is very large, too, isn't it?
>
> Not necessarily.
>
> > I don't want to argue, though.
>
> Good, you might lose the argument. :)

Yes, I admit I lost. :-(

> > But so far, I have not managed to create a dataset where R^2 is
> > larger for the model with a forced zero intercept (although I have
> > not tried very hard). It would be very convincing to see one
> > (Etienne?)
>
> Indeed, you haven't tried hard. It is not difficult. Here are my
> canonical commands to convince people why regression through the
> origin is evil; the pictures should illustrate what is going on:
> [example snipped]

Thanks to Thomas Lumley there is another convincing example. But I
still have a problem with it:

> x <- c(2,3,4); y <- c(2,3,3)
> 1 - 2*var(residuals(lm(y~x+1))) / sum((y-mean(y))^2)
[1] 0.75

That's okay, but neither

> 1 - 3*var(residuals(lm(y~x+0))) / sum((y-0)^2)
[1] 0.97076

nor

> 1 - 2*var(residuals(lm(y~x+0))) / sum((y-0)^2)
[1] 0.9805066

gives the result of summary(lm(y~x+0)), which is 0.9796.

> > IIRC, I have not been told so. Perhaps my teachers were not as good
> > as they should have been. So what is R^2 good for, if not to
> > indicate the goodness of fit?
>
> I am wondering about that too sometimes. :) It always puzzled me that
> R^2 was described to me by my lecturers as the square of the
> correlation between the x and the y variate. But on the other hand,
> they maintained that x was fixed and selected by the experimenter (or
> should be regarded as such). If x is fixed and y is random, then it
> does not make sense to me to speak about a correlation between x and
> y (at least not at the population level).

I see the point. But I was raised with that description too, and it is
hard to drop the idea.
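(For the record, the discrepancy above can be accounted for: as far as I can tell, for a model without an intercept summary.lm() computes R^2 as 1 - sum(r^2)/sum(y^2), and the var()-based shortcuts fail because the residuals of a no-intercept fit need not sum to zero, so var() centres them around a non-zero mean. A minimal check:)

```r
x <- c(2, 3, 4); y <- c(2, 3, 3)
fit0 <- lm(y ~ x + 0)
r <- residuals(fit0)

## residuals of a no-intercept fit need not sum to zero,
## so var(r) is not sum(r^2)/(n-1)
mean(r)

## R^2 as reported by summary() for a zero-intercept model:
1 - sum(r^2) / sum(y^2)          # 0.9796238
summary(fit0)$r.squared          # same value
```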
> My best guess at the moment is that R^2 was adopted by users of
> statistics before it was properly understood; and by the time it was
> properly understood, it was too entrenched to abandon. Try not to
> teach it these days and see what your "client faculties" will tell
> you.

To save the role of R^2 as a goodness-of-fit indicator in
zero-intercept models, one could use the same formula as in models
with a constant. I mean, if R^2 is the proportion of variance
explained by the model, we should use the a priori variance of the
y[i]:

> 1 - var(residuals(lm(y~x+0))) / var(y)
[1] 0.3567182

But I assume that this has probably been discussed at length somewhere
more appropriate than r-help.

Thanks,

Ralf

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.