Hi:
On Fri, Feb 18, 2011 at 2:49 AM, Jan <jrheinlaen...@gmx.de> wrote:
> Hi,
>
> I am not a statistics expert, so I have this question. A linear model
> gives me the following summary:
>
> Call:
> lm(formula = N ~ N_alt)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -110.30  -35.80  -22.77   38.07  122.76
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  13.5177   229.0764   0.059   0.9535
> N_alt         0.2832     0.1501   1.886   0.0739 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 56.77 on 20 degrees of freedom
>   (16 observations deleted due to missingness)
> Multiple R-squared: 0.151,     Adjusted R-squared: 0.1086
> F-statistic: 3.558 on 1 and 20 DF,  p-value: 0.07386
>
> The regression is not very good (high p-value, low R-squared).
> The Pr value for the intercept seems to indicate that it is zero with a
> very high probability (95.35%). So I repeat the regression forcing the
> intercept to zero:

That's not the interpretation of a p-value. What it means is: *given that
the null hypothesis beta0 = 0 is true*, the probability of observing a
value of the t-statistic *more extreme than the observed value of 0.059*
is about 0.9535. Presuming that H_0 is true for the purpose of the test
allows one to derive a 'reference distribution' (in this case, the
t-distribution with the error degrees of freedom) against which the
observed value of the t-statistic can be compared. The second part of the
emphasized statement provides the context in which the p-value is
correctly interpreted relative to that reference distribution under H_0.

You're evidently trying to interpret the p-value as the probability that
the null hypothesis is true. It isn't. You can conclude, however, that
given the magnitude of the p-value, there is not enough sample evidence
to contradict the null hypothesis beta0 = 0.
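To make that concrete, here is a small sketch (using only the t value and
degrees of freedom reported in your summary) of how the p-value is computed
from the reference t-distribution:

```r
## Recover the intercept's p-value from its t-statistic.
## Values taken from the summary above: t = 0.059 on 20 error df.
t_obs <- 0.059
dfree <- 20

## Two-sided tail probability under H0: beta0 = 0,
## i.e. P(|T| > 0.059) where T ~ t with 20 df.
p_val <- 2 * pt(-abs(t_obs), df = dfree)
round(p_val, 4)   # approximately the Pr(>|t|) column entry, 0.9535
```

Note that nothing in this calculation is "the probability that beta0 = 0";
it is a tail area of the t-distribution that holds only under the
presumption that H_0 is true.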
> Call:
> lm(formula = N ~ N_alt - 1)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -110.11  -36.35  -22.13   38.59  123.23
>
> Coefficients:
>       Estimate Std. Error t value Pr(>|t|)
> N_alt 0.292046   0.007742   37.72   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 55.41 on 21 degrees of freedom
>   (16 observations deleted due to missingness)
> Multiple R-squared: 0.9855,    Adjusted R-squared: 0.9848
> F-statistic:  1423 on 1 and 21 DF,  p-value: < 2.2e-16
>
> 1. Is my interpretation correct?
> 2. Is it possible that just by forcing the intercept to become zero, a
> bad regression becomes an extremely good one?

No.

> 3. Why doesn't lm suggest a value of zero (or near zero) by itself if
> the regression is so much better with it?

Because computer programs don't read minds. You may want a zero
intercept; someone else may not. And your perception that the
'regression is so much better' with a zero intercept is in error. If you
plotted your data, you would see that whether you fit the 'best' least
squares model or one with a zero intercept, the fit is not going to be
very good, and you would have deduced that the 0.985 R^2 returned from
the no-intercept model is an illusion. It is mathematically correct,
however, given the linear model theory behind it and the definition of
R^2 as the ratio of the model sum of squares (SS) to the total SS.

If you want to have more fun, sum the residuals from the zero-intercept
fit, and then ask yourself why they don't add to zero.

You need to educate yourself on the difference between regression with
and without an intercept. In particular, the R^2 in the with-intercept
model applies mean corrections before computing sums of squares; in the
no-intercept model, mean corrections are not applied. Since R^2 is a
ratio of sums of squares, this distinction matters. (If my use of 'mean
correction' is confusing: Y itself is not mean-corrected, but Y - Ybar
is. Ditto for X.)
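You can verify the two R^2 definitions yourself. A sketch with simulated
data (hypothetical values, since your N/N_alt data aren't posted; the
point holds for any data with a nonzero mean):

```r
## Why R^2 from a no-intercept fit is not comparable to the usual R^2.
set.seed(1)
x <- runif(22, 1000, 2000)                  # arbitrary simulated predictor
y <- 300 + 0.28 * x + rnorm(22, sd = 55)    # true model has a nonzero intercept

fit1 <- lm(y ~ x)        # with intercept
fit0 <- lm(y ~ x - 1)    # intercept forced to zero

## With an intercept, the total SS is mean-corrected:
r2_with    <- 1 - sum(resid(fit1)^2) / sum((y - mean(y))^2)

## Without an intercept, the total SS is NOT mean-corrected:
r2_without <- 1 - sum(resid(fit0)^2) / sum(y^2)

c(r2_with,    summary(fit1)$r.squared)   # these two agree
c(r2_without, summary(fit0)$r.squared)   # so do these, but the baseline differs

## And the no-intercept residuals need not sum to zero:
sum(resid(fit0))
```

Because sum(y^2) is inflated by the mean of y, the no-intercept R^2 can be
close to 1 even when the fit through the cloud of points is poor.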
Try this:

plot(N_alt, N, pch = 16)
abline(coef(lm(N ~ N_alt)))
abline(c(0, coef(lm(N ~ N_alt + 0))), lty = 'dashed')

Do the data cluster tightly around the dashed line?

HTH,
Dennis

PS: A Google search on 'linear regression zero intercept' might be
beneficial. Here are a couple of hits from such a search:

http://www.bios.unc.edu/~truong/b663/pdf/noint.pdf
http://tltc.ttu.edu/cs/colleges__schools/rawls_college_of_business/f/42/p/288/470.aspx

> Please excuse my ignorance.
>
> Jan Rheinländer
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.