Re: [R] How to assess the accuracy of fitted logistic regression using glm

Uwe Ligges Fri, 10 Jun 2011 05:49:21 -0700


On 10.06.2011 08:54, Xiaobo Gu wrote:

Hi Professor Brian,

Thanks for your reply.

I think there are many statisticians here, and the question is somewhat R
related, so I hope someone can help me.

I have done a simple test using sample CSV data, which I can post if needed.

donut <- read.csv(file = "D:/donut.csv", header = TRUE)
donut[["color"]] <- as.factor(donut[["color"]])
donut[["shape"]] <- as.factor(donut[["shape"]])
donut[["k"]] <- as.factor(donut[["k"]])
donut[["k0"]] <- as.factor(donut[["k0"]])
donut[["bias"]] <- as.factor(donut[["bias"]])

lr <- glm(color ~ shape + x + y, family = binomial, data = donut)
summary(lr)

Call:
glm(formula = color ~ shape + x + y, family = binomial, data = donut)

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-2.1079  -0.9476   0.5086   0.7518   1.4079

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.53010    1.65500   1.529   0.1263
shape22      0.05628    1.54990   0.036   0.9710
shape23     -0.74568    1.44813  -0.515   0.6066
shape24     -2.61896    1.38016  -1.898   0.0578 .
shape25     -2.07648    1.32818  -1.563   0.1180
x           -0.45885    1.52863  -0.300   0.7640
y           -0.59311    1.46999  -0.403   0.6866
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

     Null deviance: 50.446  on 39  degrees of freedom
Residual deviance: 42.473  on 33  degrees of freedom
AIC: 56.473

Number of Fisher Scoring iterations: 4

In the Coefficients section, is Pr(>|z|) the p-value for that
variable? I also have a few other questions:
1. How do I determine the predictive power of each variable?
2. How do I determine the overall performance of the fitted model, and what is the
difference between "Deviance Residuals" and "Residual deviance"?
3. How do I compare "Null deviance" and "Residual deviance"?
4. What does AIC mean, and how should I use this measure?
5. What does the Signif. codes section mean?

To answer your question, we'd need to write half a book, at least. This cannot be answered in an e-mail message. Hence please re-read Brian Ripley's advice and try to get statistical advice from a local consultant or read elementary textbooks on the subject.
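
That said, two of the printed quantities are linked by plain arithmetic, which may help with reading the output. The following is only a minimal sketch using the numbers copied from the summary quoted above; the variable names are made up for illustration. It assumes a 0/1 response, for which AIC equals the residual deviance plus twice the number of estimated coefficients, and it relies on the usual chi-squared approximation for the drop in deviance, which is rough with only 40 observations. It is no substitute for the statistical advice recommended above.

```r
# Values copied from the summary() output quoted above
null_dev  <- 50.446; null_df  <- 39
resid_dev <- 42.473; resid_df <- 33

# Number of estimated coefficients: intercept, four shape dummies, x, y
k <- null_df - resid_df + 1            # 7

# For a 0/1 response, AIC = residual deviance + 2 * k
aic <- resid_dev + 2 * k               # 56.473, matching the printed AIC

# Likelihood-ratio comparison of the fitted model with the intercept-only model
lr_stat <- null_dev - resid_dev        # drop in deviance: 7.973
lr_df   <- null_df - resid_df          # 6 extra coefficients
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)  # about 0.24
```

With the fitted object at hand, anova(lr, test = "Chisq") reports the same kind of deviance comparison term by term.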


Uwe Ligges

Regards,

Xiaobo Gu



On Mon, Jun 6, 2011 at 9:59 PM, Prof Brian Ripley<rip...@stats.ox.ac.uk>  wrote:

On Mon, 6 Jun 2011, Xiaobo Gu wrote:

Hi,

I am trying glm with family = binomial to do binary logistic
regression, but how can I assess the accuracy of the fitted model? The
summary method prints a lot of information about the returned
object, such as coefficients. Statistics is not my speciality,
so can you share some rules of thumb for examining the fitted model from a
practical perspective?


It depends entirely on why you did the fit.  People have written whole books
on assessing the performance of classification procedures such as binary
logistic regression.  For example, the residual deviance is closely related
to log-probability scoring: for some purposes that is a good performance
measure, for others (e.g. when you are going to threshold the predicted
probabilities) it can be very misleading.
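
The link between residual deviance and log-probability scoring can be checked numerically. The sketch below is an illustration under stated assumptions, not an analysis of the data in this thread: it uses synthetic 0/1 data (the names n, x, y, fit and the coefficients 0.5 and 1.2 are invented), fits a binomial glm, and compares the residual deviance with -2 times the summed log-probability assigned to the observed outcomes. It also computes accuracy at a 0.5 threshold, which is a different summary of the same fit.

```r
# Synthetic data for illustration only; this is NOT the donut data from the thread
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * x))

fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)                       # predicted P(y = 1) for each observation

# Log-probability score: log of the probability given to what actually happened
log_score <- sum(ifelse(y == 1, log(p), log(1 - p)))

# For 0/1 responses the saturated log-likelihood is 0, so the
# residual deviance is exactly -2 times the log-probability score
same <- isTRUE(all.equal(deviance(fit), -2 * log_score))

# Thresholding the probabilities at 0.5 throws away how confident each
# prediction was, so this number can rank models differently from the deviance
accuracy <- mean((p > 0.5) == (y == 1))
```

Which of the two summaries is appropriate depends, as noted above, on what the fit is for.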

In short, you need statistical advice, not R advice (the purpose of this
list).


Regards,

Xiaobo Gu

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

