That is the most common formula, but not the only one. See

  Kvålseth, T. (1985). Cautionary note about $R^2$. *American Statistician*,
*39*(4), 279–285.

Traditionally, the symbol 'R' is used for the Pearson correlation
coefficient and one way to calculate R^2 is... R^2.
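
As a rough illustration of the difference, here is a sketch on simulated data (nothing here comes from the models in this thread; the names `dat`, `fit`, `r2_ss` and `r2_cor` are made up for the example). The two formulas agree for an ordinary least-squares fit with an intercept evaluated on its own training data, but they can diverge on held-out predictions:

```r
# Sketch only: simulated data, hypothetical object names.
set.seed(1)
dat <- data.frame(x = rnorm(60))
dat$y <- 2 * dat$x + rnorm(60)
train <- dat[1:40, ]
test  <- dat[41:60, ]

fit <- lm(y ~ x, data = train)

# Formula 1: 1 - (residual sum of squares) / (total sum of squares)
r2_ss <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Formula 2: squared Pearson correlation between observed and predicted
r2_cor <- function(obs, pred) cor(obs, pred)^2

# On the training data the two coincide (OLS with an intercept)
train_pred <- fitted(fit)
c(ss = r2_ss(train$y, train_pred), cor = r2_cor(train$y, train_pred))

# On held-out data they generally differ; the sum-of-squares version
# can even be negative, while the squared correlation cannot
test_pred <- predict(fit, newdata = test)
c(ss = r2_ss(test$y, test_pred), cor = r2_cor(test$y, test_pred))
```

The squared correlation is never smaller than the sum-of-squares version computed on the same observations, because it ignores any bias or scale error in the predictions. Which of the two a given package reports is worth checking in its documentation.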


On Sun, Mar 3, 2013 at 3:16 PM, Charles Determan Jr <> wrote:

> I was under the impression that in PLS analysis, R2 was calculated by 1-
> (Residual sum of squares) / (Sum of squares).  Is this still what you are
> referring to?  I am aware of the linear R2 which is how well two variables
> are correlated but the prior equation seems different to me.  Could you
> explain if this is the same concept?
> Charles
> On Sun, Mar 3, 2013 at 12:46 PM, Max Kuhn <> wrote:
>> > Is there some literature on which you base that statement?
>> No, but there isn't literature on changing a lightbulb with a duck either.
>> > Are these papers incorrect in using these statistics?
>> Definitely, if they convert 3+ categories to integers (but there are
>> specialized R^2 metrics for binary classification models). Otherwise, they
>> are just using an ill-suited "score".
>> How would you explain such an R^2 value to someone? R^2 is
>> a function of the correlation between the two random variables. For two
>> classes, one of them is binary. What does it mean?
>> Historically, models rooted in computer science (e.g. neural networks) used
>> RMSE or SSE to fit models with binary outcomes and that *can* work
>> well.
>> However, I don't think that communicating R^2 is effective. Other metrics
>> (e.g. accuracy, Kappa, area under the ROC curve, etc) are designed to
>> measure the ability of a model to classify and work well. With 3+
>> categories, I tend to use Kappa.
>> Max
>> On Sun, Mar 3, 2013 at 10:53 AM, Charles Determan Jr <> wrote:
>>> Thank you for your response, Max.  Is there some literature on which you
>>> base that statement?  I am confused, as I have seen many publications that
>>> report R^2 and Q^2 following PLSDA analysis.  The analysis is usually meant
>>> to discriminate groups (i.e., classification).  Are these papers incorrect
>>> in using these statistics?
>>> Regards,
>>> Charles
>>> On Sat, Mar 2, 2013 at 10:39 PM, Max Kuhn <> wrote:
>>>> Charles,
>>>> You should not be treating the classes as numeric (is virginica really
>>>> three times setosa?). Q^2 and/or R^2 are not appropriate for 
>>>> classification.
>>>> Max
>>>> On Sat, Mar 2, 2013 at 5:21 PM, Charles Determan Jr <> wrote:
>>>>> I have discovered one of my errors.  The timematrix was unnecessary and an
>>>>> unfortunate habit I brought from another package.  The following gives the
>>>>> same R2 values, as it should; however, I still don't know how to retrieve
>>>>> Q2 values.  Any insight would again be appreciated:
>>>>> library(caret)
>>>>> library(pls)
>>>>> data(iris)
>>>>> #needed to convert to numeric in order to do regression
>>>>> #I don't fully understand this but if I left as a factor I would get an
>>>>> #error following the summary function
>>>>> iris$Species=as.numeric(iris$Species)
>>>>> inTrain1=createDataPartition(y=iris$Species,
>>>>>     p=.75,
>>>>>     list=FALSE)
>>>>> training1=iris[inTrain1,]
>>>>> testing1=iris[-inTrain1,]
>>>>> ctrl1=trainControl(method="cv",
>>>>>     number=10)
>>>>> plsFit2=train(Species~.,
>>>>>     data=training1,
>>>>>     method="pls",
>>>>>     trControl=ctrl1,
>>>>>     metric="Rsquared",
>>>>>     preProc=c("scale"))
>>>>> data(iris)
>>>>> training1=iris[inTrain1,]
>>>>> datvars=training1[,1:4]
>>>>> pls.dat=plsr(as.numeric(training1$Species)~as.matrix(datvars),
>>>>>     ncomp=3, method="oscorespls", data=training1)
>>>>> x=crossval(pls.dat, segments=10)
>>>>> summary(x)
>>>>> summary(plsFit2)
>>>>> Regards,
>>>>> Charles
>>>>> On Sat, Mar 2, 2013 at 3:55 PM, Charles Determan Jr <> wrote:
>>>>> > Greetings,
>>>>> >
>>>>> > I have been exploring the use of the caret package to conduct some plsda
>>>>> > modeling.  Previously, I have come across methods that result in an R2 and
>>>>> > Q2 for the model.  Using the 'iris' data set, I wanted to see if I could
>>>>> > accomplish this with the caret package.  I use the following code:
>>>>> >
>>>>> > library(caret)
>>>>> > data(iris)
>>>>> >
>>>>> > #needed to convert to numeric in order to do regression
>>>>> > #I don't fully understand this but if I left as a factor I would get an
>>>>> > #error following the summary function
>>>>> > iris$Species=as.numeric(iris$Species)
>>>>> > inTrain1=createDataPartition(y=iris$Species,
>>>>> >     p=.75,
>>>>> >     list=FALSE)
>>>>> >
>>>>> > training1=iris[inTrain1,]
>>>>> > testing1=iris[-inTrain1,]
>>>>> >
>>>>> > ctrl1=trainControl(method="cv",
>>>>> >     number=10)
>>>>> >
>>>>> > plsFit2=train(Species~.,
>>>>> >     data=training1,
>>>>> >     method="pls",
>>>>> >     trControl=ctrl1,
>>>>> >     metric="Rsquared",
>>>>> >     preProc=c("scale"))
>>>>> >
>>>>> > data(iris)
>>>>> > training1=iris[inTrain1,]
>>>>> > datvars=training1[,1:4]
>>>>> >
>>>>> >
>>>>> > n=nrow(training1)
>>>>> > dat.indices=seq(1,n)
>>>>> >
>>>>> > timematrix=with(training1,
>>>>> >         classvec2classmat(Species[dat.indices]))
>>>>> >
>>>>> > pls.dat=plsr(timematrix ~ as.matrix(datvars),
>>>>> >     ncomp=3, method="oscorespls", data=training1)
>>>>> >
>>>>> > x=crossval(pls.dat, segments=10)
>>>>> >
>>>>> > summary(x)
>>>>> > summary(plsFit2)
>>>>> >
>>>>> > I see two different R2 values and I cannot figure out how to get the Q2
>>>>> > value.  Any insight as to what my errors may be would be appreciated.
>>>>> >
>>>>> > Regards,
>>>>> >
>>>>> > --
>>>>> > Charles
>>>>> >
>>>>> --
>>>>> Charles Determan
>>>>> Integrated Biosciences PhD Student
>>>>> University of Minnesota
>>>>>         [[alternative HTML version deleted]]
>>>>> ______________________________________________
>>>>> mailing list
>>>>> PLEASE do read the posting guide
>>>>> and provide commented, minimal, self-contained, reproducible code.


