I am attempting to run a glm with a binomial model to analyze proportion
data.
I have been following Crawley's book closely and am wondering if there is
an accepted standard for how much is too much overdispersion? (e.g. change
in AIC has an accepted standard of 2).

In the example, he fits several models, binomial and quasibinomial and then
accepts the quasibinomial.
The output for residual deviance and df does not change between these two
models, however, he accepts the quasibinomial.
Is there a different calculation that I missed that actually provides an
overdispersion factor (as in SAS?)

My code and output are below, given the example in the book, these data are
WAY overdispersed .....do I mention this and go on or does this signal the
need to try a different model? If so, any suggestions on the type of
distribution (gamma or negative binomial ?)?

attach(Clutch2)
 y<-cbind(Total,Size-Total)
glm1<-glm(y~Pred,"binomial")
summary(glm1)

Call:
glm(formula = y ~ Pred, family = "binomial")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.35095    0.06612  20.433  < 2e-16 ***
PredF       -0.34811    0.11719  -2.970  0.00297 **
PredSN      -3.29156    0.10691 -30.788  < 2e-16 ***
PredW       -1.46451    0.12787 -11.453  < 2e-16 ***
PredWF      -0.56412    0.13178  -4.281 1.86e-05 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 2815.5  on 108  degrees of freedom
Residual deviance: 1323.5  on 104  degrees of freedom
  (3 observations deleted due to missingness)
AIC: 1585.2

Number of Fisher Scoring iterations: 5


#### the output for residual deviance and df does not change even when I
use quasibinomial, is this ok?  #####


 library(MASS)
> glm2<-glm(y~Pred,"quasibinomial")
> summary(glm2)

Call:
glm(formula = y ~ Pred, family = "quasibinomial")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.3510     0.2398   5.633 1.52e-07 ***
PredF        -0.3481     0.4251  -0.819  0.41471
PredSN       -3.2916     0.3878  -8.488 1.56e-13 ***
PredW        -1.4645     0.4638  -3.157  0.00208 **
PredWF       -0.5641     0.4780  -1.180  0.24063
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

(Dispersion parameter for quasibinomial family taken to be 13.15786)

    Null deviance: 2815.5  on 108  degrees of freedom
Residual deviance: 1323.5  on 104  degrees of freedom
  (3 observations deleted due to missingness)
AIC: NA

Number of Fisher Scoring iterations: 5

> anova(glm2,test="F")
Analysis of Deviance Table

Model: quasibinomial, link: logit

Response: y

Terms added sequentially (first to last)


      Df Deviance Resid. Df Resid. Dev      F   Pr(>F)
NULL                    108     2815.5
Pred   4   1492.0       104     1323.5 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> model1<-update(glm2,~.-Pred)
> anova(glm2,model1,test="F")
Analysis of Deviance Table

Model 1: y ~ Pred
Model 2: y ~ 1
  Resid. Df Resid. Dev  Df Deviance      F   Pr(>F)
1       104     1323.5
2       108     2815.5  -4  -1492.0 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> coef(glm2)
(Intercept)       PredF      PredSN       PredW      PredWF
  1.3509550  -0.3481096  -3.2915601  -1.4645097  -0.5641223


Thanks
Jessica
hit...@vcu.edu

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to