As Kehl pointed out, any non-constant linear function of the independent 
variable (speed) has the same squared correlation with the dependent variable 
(dist), but only one linear function, the least-squares fit, minimizes the 
squared deviations between the fitted values and the original values. The 
equation you are using, 1 - SSE/SST, equals the squared correlation only for 
that function, not for any of the others. In fact, for some linear functions it 
produces negative values:

> fitted.new <- 6*cars$speed
> cor(cbind(fitted.new, fitted.right, fitted.wrong, cars$dist))
             fitted.new fitted.right fitted.wrong          
fitted.new    1.0000000    1.0000000    1.0000000 0.8068949
fitted.right  1.0000000    1.0000000    1.0000000 0.8068949
fitted.wrong  1.0000000    1.0000000    1.0000000 0.8068949
              0.8068949    0.8068949    0.8068949 1.0000000
> 1-sum((cars$dist-fitted.new)^2)/sum((cars$dist-mean(cars$dist))^2)
[1] -3.281849
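
To see the two measures side by side, here is a minimal sketch that tabulates 
both for the three sets of fitted values in this thread. The helper name 
r2_sse is hypothetical and introduced only for this illustration; everything 
else comes from the code already posted.

```r
# Sketch: compare 1 - SSE/SST with the squared correlation for three lines.
data(cars)
model <- lm(dist ~ speed, data = cars)

# Hypothetical helper: the "1 - SSE/SST" formula from the original question.
r2_sse <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Three linear functions of speed: the OLS fit and the two hand-picked lines.
preds <- list(
  ols   = fitted(model),
  wrong = -17 + 5 * cars$speed,
  new   = 6 * cars$speed
)

result <- data.frame(
  r2_sse = sapply(preds, r2_sse, obs = cars$dist),
  cor_sq = sapply(preds, function(p) cor(cars$dist, p)^2)
)
print(result)
```

The cor_sq column should be identical in all three rows, while r2_sse should 
match it only in the ols row and fall below it for the other two lines.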

David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Thayn
Sent: Sunday, February 22, 2015 12:01 AM
To: Kehl Dániel
Cc: r-help@r-project.org
Subject: Re: [R] Correlation question

Of course! Thank you, I knew I was missing something painfully obvious. It 
seems, then, that this line

1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2)

is finding something other than the traditional correlation. I found this in a 
lecture introducing correlation but, now, I'm not sure what it is. It does do 
a better job of showing that the fitted.wrong variable is not a good prediction 
of the distance.
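
One way to see what that line measures: it is one minus the ratio of the 
residual sum of squares to the total sum of squares. The total splits as 
SST = SSE + SSR + cross term, and the cross term vanishes for a least-squares 
fit with an intercept, which is when the formula coincides with the squared 
correlation. The sketch below checks this numerically; the helper name 
decompose is hypothetical and used only for illustration.

```r
# Sketch: split the total sum of squares for the OLS fit and for fitted.wrong.
data(cars)
y <- cars$dist

# Hypothetical helper: returns the pieces of
#   sum((y - mean(y))^2) = SSE + SSR + cross
decompose <- function(pred) {
  c(sst   = sum((y - mean(y))^2),
    sse   = sum((y - pred)^2),
    ssr   = sum((pred - mean(y))^2),
    cross = 2 * sum((y - pred) * (pred - mean(y))))
}

parts_ols   <- decompose(fitted(lm(dist ~ speed, data = cars)))
parts_wrong <- decompose(-17 + 5 * cars$speed)
print(rbind(ols = parts_ols, wrong = parts_wrong))
```

For the OLS row the cross term should be zero up to rounding error, so 
1 - SSE/SST equals SSR/SST. For fitted.wrong the cross term is far from zero, 
so the formula no longer reduces to a squared correlation and simply reports 
how large the prediction errors are relative to the spread of dist.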



On Feb 21, 2015, at 4:36 PM, Kehl Dániel wrote:

> Hi,
> 
> try
> 
> cor(fitted.right,fitted.wrong)
> 
> should give 1 as both are a linear function of speed! Hence 
> cor(cars$dist,fitted.right)^2 and cor(x=cars$dist,y=fitted.wrong)^2 must be 
> the same.
> 
> HTH
> d
> ________________________________________
> From: R-help [r-help-boun...@r-project.org] on behalf of Jonathan 
> Thayn [jth...@ilstu.edu]
> Sent: 21 February 2015, 22:42
> To: r-help@r-project.org
> Subject: [R] Correlation question
> 
> I recently compared two different approaches to calculating the correlation 
> of two variables, and I cannot explain the different results:
> 
> data(cars)
> model <- lm(dist~speed,data=cars)
> coef(model)
> fitted.right <- fitted(model)
> fitted.wrong <- -17+5*cars$speed
> 
> 
> When using the OLS fitted values, the lines below all return the same R2 
> value:
> 
> 1-sum((cars$dist-fitted.right)^2)/sum((cars$dist-mean(cars$dist))^2)
> cor(cars$dist,fitted.right)^2
> (sum((cars$dist-mean(cars$dist))*(fitted.right-mean(fitted.right)))/(49*sd(cars$dist)*sd(fitted.right)))^2
> 
> 
> However, when I use my estimated parameters to find the fitted values, 
> "fitted.wrong", the first equation returns a much lower R2 value, which I 
> would expect since the fit is worse, but the other lines return the same R2 
> that I get when using the OLS fitted values.
> 
> 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2)
> cor(x=cars$dist,y=fitted.wrong)^2
> (sum((cars$dist-mean(cars$dist))*(fitted.wrong-mean(fitted.wrong)))/(49*sd(cars$dist)*sd(fitted.wrong)))^2
> 
> 
> I'm sure I'm missing something simple, but can someone explain the difference 
> between these two methods of finding R2? Thanks.
> 
> Jon
> 
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
