See also this thread on stats.stackexchange:
https://stats.stackexchange.com/questions/26176/removal-of-statistically-significant-intercept-term-increases-r2-in-linear-mo
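
On the question quoted below ("what is R doing in this case?"): as far as I can tell from summary.lm, when the model has no intercept the total sum of squares is taken about zero rather than about the mean of y. Here is a minimal sketch of both versions on the same cars example. The helper names r2_centered() and r2_uncentered() are my own, not part of base R, and this is only meant as an illustration of the "manual" calculation JN recommends.

```r
# Sketch: two R-squared definitions for a regression through the origin.
# r2_centered() and r2_uncentered() are hypothetical helper names,
# not functions from base R or any package.

r2_centered <- function(fit, y) {
  rss <- sum(residuals(fit)^2)   # residual sum of squares
  tss <- sum((y - mean(y))^2)    # total sum of squares about the mean
  1 - rss / tss
}

r2_uncentered <- function(fit, y) {
  rss  <- sum(residuals(fit)^2)  # residual sum of squares
  tss0 <- sum(y^2)               # total sum of squares about zero
  1 - rss / tss0
}

# Linear regression through the origin on the built-in cars data
fit0 <- lm(dist ~ 0 + speed, data = cars)

r2_lm  <- summary(fit0)$r.squared          # what summary() reports
r2_unc <- r2_uncentered(fit0, cars$dist)   # baseline model: y = 0
r2_cen <- r2_centered(fit0, cars$dist)     # baseline model: y = mean(y)

print(c(lm = r2_lm, uncentered = r2_unc, centered = r2_cen))
```

If this is right, the lm and uncentered entries should agree (the 0.8962893 in the quoted message) and the centered entry should match the 0.6018997 computed directly there. I don't know of an option in summary() that switches between the two, so the small function seems the safest route.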

On Thu, Sep 27, 2018 at 3:43 PM, J C Nash <profjcn...@gmail.com> wrote:

> This issue traces back to the very unfortunate use of R-squared as the
> name of a tool to simply compare a model to the model that is a single
> number (the mean). The mean can be shown to be the optimal choice for a
> model that is a single number, so it makes sense to try to do better.
>
> The OP has the correct form -- and I find no matter what the software,
> when working with models that do NOT have a constant in them (i.e.,
> nonlinear models, regression through the origin) it pays to do the
> calculation "manually". In R it is really easy to write the necessary
> function, so why take a chance that a software developer has tried to
> expand the concept using a personal choice that is beyond a clear
> definition?
>
> I've commented elsewhere that I use this statistic even for nonlinear
> models in my own software, since I think one should do better than the
> mean for a model, but other workers shy away from using it for nonlinear
> models because there may be false interpretation based on its use for
> linear models.
>
> JN
>
>
> On 2018-09-27 06:56 AM, Patrick Barrie wrote:
> > I have a query on the R-squared correlation coefficient for linear
> > regression through the origin.
> >
> > The general expression for R-squared in regression (whether linear or
> > non-linear) is
> > R-squared = 1 - sum((y-ypredicted)^2) / sum((y-ybar)^2)
> >
> > However, the lm function within R does not seem to use this expression
> > when the intercept is constrained to be zero. It gives results different
> > to Excel and other data analysis packages.
> >
> > As an example (using built-in cars dataframe):
> > > cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
> > > summary(cars.lm)$r.squared # report R-squared
> > [1] 0.8962893
> > > 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
> > [1] 0.6018997
> > > # The latter corresponds to the value reported by Excel (and other data analysis packages)
> > > # Note that we expect R-squared to be smaller for linear regression through the origin
> > > # than for linear regression without a constraint (which is 0.6511 in this example)
> >
> > Does anyone know what R is doing in this case? Is there an option to get
> > R to return what I termed the "general" expression for R-squared? The
> > adjusted R-squared value is also affected. [Other parameters all seem
> > correct.]
> >
> > Thanks for any help on this issue,
> >
> > Patrick
> >
> > P.S. I believe old versions of Excel (before 2003) also had this issue.
> >
> > ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.