Consider this code fragment:

---------------------------------------------------------------------------
set.seed(42)
x <- runif(20)
y <- 2 + 3*x + rnorm(20)
m1 <- lm(y ~ x)
m2 <- lm(y ~ -1 + x)
summary(m1)
summary(m2)
cor(y, fitted.values(m1))^2
cor(y, fitted.values(m2))^2
---------------------------------------------------------------------------

m1 is the true model and all is well. m2 is a false model: the intercept is truly 2, but it has been omitted. The R^2 for m1 shows as 0.4953, while for m2 it shows 0.8983.

I am aware that there are difficulties with the standard formulas for R^2 when there is no intercept, so the fact that the R^2 of m2 is much higher (even though it is the wrong model) probably flows from that.

What surprised me was that both correlations (between y and the fitted values of either m1 or m2) are identical. I am unable to understand how this could be, since the estimated coefficient of x is quite different between the two cases. There must be an interesting theoretical angle to this.

I would greatly appreciate some help in understanding this, and (more generally) in interpreting the R^2 of regressions where the intercept is absent.

--
Ajay Shah                                     http://www.mayin.org/ajayshah
ajays...@mayin.org                            http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
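[For reference, a small sketch of the two mechanisms at play, reusing the simulation from the post. The hand-computed R^2 formulas below reflect my understanding of what summary.lm() does with and without an intercept; the fact about correlations is just its invariance to positive affine transformations.]

```r
## Sketch: both sets of fitted values are increasing linear functions of x,
## and correlation is invariant to positive affine transformations, so
## cor(y, a + b*x) == cor(y, b2*x) == cor(y, x) whenever b, b2 > 0.
set.seed(42)
x <- runif(20)
y <- 2 + 3*x + rnorm(20)
m1 <- lm(y ~ x)        # with intercept: fitted(m1) = a + b*x
m2 <- lm(y ~ -1 + x)   # through the origin: fitted(m2) = b2*x

cor(y, fitted(m1))^2   # all three of these agree,
cor(y, fitted(m2))^2   # even though the slopes differ
cor(y, x)^2

## summary()'s R^2 uses different baselines in the two cases:
## with an intercept, the total SS is taken about mean(y);
## without one, the comparison model is y = 0, so the total SS is sum(y^2).
1 - sum(resid(m1)^2) / sum((y - mean(y))^2)   # matches summary(m1)$r.squared
1 - sum(resid(m2)^2) / sum(y^2)               # matches summary(m2)$r.squared
```

The second pair of lines is why the no-intercept R^2 can jump even for a worse model: it is being measured against a much weaker baseline.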