Does anyone know how to get the C-index from a logistic model - not using
the dataset that was used to train the model, but instead using a fresh
dataset on the same model?

I have a dataset of 400 points that I've split into two halves, one for
training the logistic model, and the other for evaluating it. The structure
is as follows:

The column headers are "got a loan" (dichotomous), "hourly income"
(continuous), and "owns own home" (dichotomous).
The training data is
trainingData[1,] = c(0,12,0)
...
etc

and the validation data is
validationData[1,] = c(1,35,1)
...
etc

I use Prof. Harrell's excellent Design package to perform a logistic
regression on the training data like so:
logit.lrm <- lrm(gotALoan ~ hourlyIncome + ownsHome, data=trainingData)
lrm(formula = logit.lrm)$stats[6]
(output is C 0.8739827 - i.e., just the C-index)

I really like the ability to extract the C-index (or ROC AUC), because this
is a factor that I find very helpful in comparing various models. However, I
don't really want to get that from the data that the model was built on.
Using that C-statistic would be cheating, in a sense, since I'm just testing
the model on the data it was built against. I would rather get the
C-statistic by applying the model I just generated to the other half of the
data that I saved.

I have tried doing this:
lrm(formula = logit.lrm, data=validationData)
However, this actually fits a new model (giving different coefficients
to the variables). It doesn't simply apply the model from logit.lrm that I
generated before to the new data.

So, can someone point me in the right direction for evaluating the model
that I built with trainingData, but getting the C-statistic against my
validationData?
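
To make concrete what I am after, here is a minimal sketch in base R. It
makes several assumptions: glm() stands in for lrm(), the data are simulated
(the coefficients -4, 0.1 and 1.2 are invented purely to generate example
outcomes, not taken from my real data), and cIndex() is a hypothetical helper
I wrote for illustration, not a function from Design. The idea is to keep the
coefficients from the training fit fixed, predict on the validation half, and
compute the C-index from those predictions.

```r
# Simulated stand-in for the 400-row dataset (hypothetical values only).
set.seed(1)
n <- 400
hourlyIncome <- round(runif(n, 8, 60))
ownsHome <- rbinom(n, 1, 0.4)
# Assumed relationship, used only to generate example outcomes.
p <- plogis(-4 + 0.1 * hourlyIncome + 1.2 * ownsHome)
gotALoan <- rbinom(n, 1, p)
allData <- data.frame(gotALoan, hourlyIncome, ownsHome)

# Split into two halves: one for fitting, one for evaluation.
idx <- sample(n, n / 2)
trainingData <- allData[idx, ]
validationData <- allData[-idx, ]

# Fit on the training half only (glm standing in for lrm).
fit <- glm(gotALoan ~ hourlyIncome + ownsHome,
           family = binomial, data = trainingData)

# Apply the already-fitted coefficients to the validation half.
# No refitting happens here.
validationPred <- predict(fit, newdata = validationData, type = "link")

# C-index: the proportion of (positive, negative) pairs in which the
# positive case has the higher prediction, with ties counted as 1/2.
# Computed through the rank-sum (Mann-Whitney) identity.
cIndex <- function(pred, y) {
  r <- rank(pred)          # average ranks, so ties count half
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

cTrain <- cIndex(predict(fit, type = "link"), trainingData$gotALoan)
cValid <- cIndex(validationPred, validationData$gotALoan)
print(c(training = cTrain, validation = cValid))
```

With the lrm fit itself, the analogous step would presumably be
predict(logit.lrm, newdata = validationData), with the predictions then
passed to something like somers2() from Hmisc, but I have not verified that
here.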

Thanks so much,

Kyle Werner

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
