Hi,

R uses a different formula for R^2 depending on whether the model includes an intercept.
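
As a rough sketch of what that means in practice (this follows the definition given in ?summary.lm; the variable names such as rss0 and r2_without are only illustrative): with an intercept, the total sum of squares is taken about the mean of the response, and without one it is taken about zero. Reproducing both by hand on mtcars:

```r
# Sketch: reproduce the R^2 reported by summary.lm() by hand.
m0 <- lm(mpg ~ 0 + disp, data = mtcars)  # no intercept
m1 <- lm(mpg ~ disp, data = mtcars)      # with intercept

y    <- mtcars$mpg
rss0 <- sum(residuals(m0)^2)  # residual sum of squares, no intercept
rss1 <- sum(residuals(m1)^2)  # residual sum of squares, with intercept

# With an intercept: total sum of squares about the mean of y.
r2_with <- 1 - rss1 / sum((y - mean(y))^2)

# Without an intercept: total sum of squares about zero.
r2_without <- 1 - rss0 / sum(y^2)

# Both should agree with what summary() reports.
all.equal(r2_with,    summary(m1)$r.squared)
all.equal(r2_without, summary(m0)$r.squared)

# For comparison only: the no-intercept fit scored against the
# mean-centered total sum of squares. This is NOT what R reports,
# and it can be negative when the zero-intercept fit is poor.
r2_centered0 <- 1 - rss0 / sum((y - mean(y))^2)
r2_centered0
```

Because sum(y^2) is usually much larger than sum((y - mean(y))^2) when the response is far from zero, the no-intercept R^2 can look very high even when the fit is worse. Whether other packages report something like the centered version is an assumption on my part, so check their documentation.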

You may also find this discussion helpful:
http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm/

If you conceptualize R^2 as the squared correlation between the
observed and fitted values, it is easy to get:

summary(m0 <- lm(mpg ~ 0 + disp, data = mtcars))
summary(m1 <- lm(mpg ~ disp, data = mtcars))
cor(mtcars$mpg, fitted(m0))^2
cor(mtcars$mpg, fitted(m1))^2

but that is not how R calculates R^2.

Cheers,

Josh

On Sat, Jul 28, 2012 at 10:40 AM, citynorman <citynor...@hotmail.com> wrote:
> I've just picked up R (been using Matlab, Eviews etc) and I'm having the same
> issue. Running reg=lm(ticker1~ticker2) gives R^2=50% while running
> reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is
> worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with
> intercept=0. How come R calculates a totally different value?!
>
> Call:
> lm(formula = ticker1 ~ ticker2)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.22441 -0.03380  0.01099  0.04891  0.16688
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  1.57062    0.08187   19.18   <2e-16 ***
> ticker2      0.61722    0.02699   22.87   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.07754 on 530 degrees of freedom
> Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958
> F-statistic: 523.1 on 1 and 530 DF, p-value: < 2.2e-16
>
> Call:
> lm(formula = ticker1 ~ 0 + ticker2)
>
> Residuals:
>       Min        1Q    Median        3Q       Max
> -0.270785 -0.069280 -0.007945  0.087340  0.268786
>
> Coefficients:
>         Estimate Std. Error t value Pr(>|t|)
> ticker2 1.134508   0.001441   787.2   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.1008 on 531 degrees of freedom
> Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991
> F-statistic: 6.197e+05 on 1 and 531 DF, p-value: < 2.2e-16
>
>
> Jan private wrote
>>
>> Hi,
>>
>> thanks for your help.
>> I'm beginning to understand things better.
>>
>>> If you plotted your data, you would realize that whether you fit the
>>> 'best' least squares model or one with a zero intercept, the fit is
>>> not going to be very good
>>> Do the data cluster tightly around the dashed line?
>> No, and that is why I asked the question. The plotted fit doesn't look
>> any better with or without intercept, so I was surprised that the
>> R-value etc. indicated an excellent regression (which I now understood
>> is the wrong interpretation).
>>
>> One of the references you googled suggests that intercepts should never
>> be omitted. Is this true even if I know that the physical reality behind
>> the numbers suggests an intercept of zero?
>>
>> Thanks,
>> Jan
>>
>> ______________________________________________
>> R-help@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/lm-without-intercept-tp3312429p4638204.html
> Sent from the R help mailing list archive at Nabble.com.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/