Re: [R] Poisson Regression: questions about tests of assumptions

Achim Zeileis Sun, 14 Oct 2012 09:16:03 -0700

On Sun, 14 Oct 2012, Eiko Fried wrote:

I would like to test in R what regression fits my data best. My dependent
variable is a count, and has a lot of zeros.


And I would need some help to determine what model and family to use
(poisson or quasipoisson, or zero-inflated poisson regression), and how to
test the assumptions.

1) Poisson Regression: as far as I understand, the strong assumption is
that dependent variable mean = variance. How do you test this? How close
together do they have to be? Are unconditional or conditional mean and
variance used for this? What do I do if this assumption does not hold?

There are various formal tests for this, e.g., dispersiontest() in package
"AER". Alternatively, you can use a simple likelihood-ratio test (e.g., by
means of lrtest() in "lmtest") between a Poisson and negative binomial
(NB) fit. The p-value can even be halved because the Poisson is on the
border of the NB theta parameter range (theta = infty).
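A minimal sketch of both tests, on simulated data rather than the poster's
data. It assumes the packages "AER", "lmtest" and "MASS" are installed; the
variable names (d, m_pois, m_nb) and the simulation settings are
illustrative only.

```r
# Assumption: AER, lmtest and MASS are installed.
library(AER)     # dispersiontest()
library(lmtest)  # lrtest()
library(MASS)    # glm.nb()

set.seed(1)
n <- 500
x <- rnorm(n)
# Simulated overdispersed counts: NB with mean exp(0.5 + 0.3 * x), theta = 1.5
d <- data.frame(x = x, y = rnbinom(n, mu = exp(0.5 + 0.3 * x), size = 1.5))

m_pois <- glm(y ~ x, family = poisson, data = d)
m_nb   <- glm.nb(y ~ x, data = d)

# Formal test of equidispersion against overdispersion
dt <- dispersiontest(m_pois)
print(dt)

# Likelihood-ratio test, Poisson (restricted) vs. NB
lr <- lrtest(m_pois, m_nb)
print(lr)

# Poisson lies on the boundary of the NB parameter space (theta = Inf),
# so the reported p-value can be halved
p_half <- lr[2, "Pr(>Chisq)"] / 2
```

Small p-values in either test speak against the Poisson's
mean = variance assumption.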

However, overdispersion can already matter before this is detected by a
significance test. Hence, if in doubt, I would simply use an NB model and
you're on the safe side. And if the NB's estimated theta parameter turns
out to be extremely large (say beyond 20 or 30), then you can still switch
back to Poisson if you want.

2) I read that if variance is greater than mean we have overdispersion,
and a potential way to deal with this is including more independent
variables, or family=quasipoisson. Does this distribution have any other
requirements or assumptions? What test do I use to see whether 1) or 2)
fits better - simply anova(m1,m2)?

quasipoisson yields the same parameter estimates as the poisson, only the
inference is adjusted appropriately.
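This can be checked directly with base R. The sketch below uses simulated
data (an assumption, not the poster's data): the coefficients agree, and
the quasipoisson standard errors are the Poisson ones scaled by the square
root of the estimated dispersion.

```r
set.seed(2)
n <- 300
x <- rnorm(n)
# Simulated overdispersed counts (illustrative settings)
y <- rnbinom(n, mu = exp(1 + 0.4 * x), size = 2)

m_pois <- glm(y ~ x, family = poisson)
m_qp   <- glm(y ~ x, family = quasipoisson)

# Identical point estimates
same_coef <- isTRUE(all.equal(coef(m_pois), coef(m_qp)))

# Estimated dispersion parameter of the quasipoisson fit
phi <- summary(m_qp)$dispersion

# Ratio of standard errors: equals sqrt(phi) for every coefficient
se_ratio <- coef(summary(m_qp))[, "Std. Error"] /
            coef(summary(m_pois))[, "Std. Error"]

print(same_coef)
print(phi)
print(se_ratio)
```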

3) I also read that negative-binomial distribution can be used when
overdispersion appears. How do I do this in R?


glm.nb() in "MASS" is one of the standard options.
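A minimal sketch of an NB fit and of how to read off the estimated theta,
again on simulated data with illustrative settings (true theta = 1.5). It
assumes "MASS" is installed.

```r
# Assumption: MASS is installed (it ships with standard R distributions).
library(MASS)

set.seed(3)
n <- 1000
x <- rnorm(n)
y <- rnbinom(n, mu = exp(0.5 + 0.3 * x), size = 1.5)

m_nb <- glm.nb(y ~ x)
print(summary(m_nb))

# Estimated theta and its standard error; a very large theta
# would indicate that the fit is close to a Poisson model
theta_hat <- m_nb$theta
se_theta  <- m_nb$SE.theta
print(c(theta = theta_hat, se = se_theta))
```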

What is the difference to quasipoisson?

The NB is a likelihood-based model while the quasipoisson is not
associated with a likelihood (but has the same conditional mean equation).
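One practical consequence, sketched on simulated data (illustrative
settings, "MASS" assumed to be installed): likelihood-based summaries such
as logLik() and AIC() are available for the NB fit but are NA for a
quasipoisson fit.

```r
# Assumption: MASS is installed.
library(MASS)

set.seed(4)
n <- 300
x <- rnorm(n)
y <- rnbinom(n, mu = exp(1 + 0.4 * x), size = 2)

m_qp <- glm(y ~ x, family = quasipoisson)
m_nb <- glm.nb(y ~ x)

# No likelihood behind the quasi family, hence no AIC
aic_qp <- AIC(m_qp)

# The NB fit has a proper log-likelihood and AIC
ll_nb  <- logLik(m_nb)
aic_nb <- AIC(m_nb)

print(aic_qp)
print(ll_nb)
print(aic_nb)
```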

4) Zero-inflated Poisson Regression: I read that the Vuong test
checks which model fits better.

vuong (model.poisson, model.zero.poisson)

Is that correct?


It's one of the possibilities.
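A minimal sketch of such a comparison, assuming package "pscl" is
installed. The data are simulated with 30% excess zeros; the names (d,
m_pois, m_zip) and the settings are illustrative only.

```r
# Assumption: pscl is installed.
library(pscl)  # zeroinfl(), vuong()

set.seed(5)
n <- 1000
x <- rnorm(n)
# Simulated zero-inflated counts: structural zero with probability 0.3,
# otherwise Poisson with mean exp(1 + 0.5 * x)
structural_zero <- rbinom(n, size = 1, prob = 0.3)
y <- ifelse(structural_zero == 1, 0, rpois(n, lambda = exp(1 + 0.5 * x)))
d <- data.frame(x = x, y = y)

m_pois <- glm(y ~ x, family = poisson, data = d)
# "| 1" requests an intercept-only model for the zero-inflation part
m_zip  <- zeroinfl(y ~ x | 1, data = d)

# Vuong test for the two non-nested fits; prints the test statistics
vuong(m_zip, m_pois)

ll_pois <- as.numeric(logLik(m_pois))
ll_zip  <- as.numeric(logLik(m_zip))
```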

5) ats.ucla.edu has a section about zero-inflated Poisson Regressions, and
test the zeroinflated model (a) against the standard poisson model (b):

m.a <- zeroinfl(count ~ child + camper | persons, data = zinb)
m.b <- glm(count ~ child + camper, family = poisson, data = zinb)
vuong(m.a, m.b)

I don't understand what the "| persons" part of the first model does, and
why you can compare these models. I had expected the regression to be
the same and just use a different family.

I recommend you read the associated documentation. See
vignette("countreg", package = "pscl")
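In short, as documented for zeroinfl(): the formula has two parts. The
regressors before "|" enter the count model, and those after "|" enter the
model for the excess zeros. A minimal sketch on simulated data ("pscl"
assumed to be installed; the regressors x and w are hypothetical):

```r
# Assumption: pscl is installed.
library(pscl)

set.seed(6)
n <- 2000
x <- rnorm(n)
w <- rnorm(n)
# Probability of a structural zero depends on w (logit link);
# the Poisson mean depends on x
p_zero <- plogis(-1 + 1.5 * w)
structural_zero <- rbinom(n, size = 1, prob = p_zero)
y <- ifelse(structural_zero == 1, 0, rpois(n, lambda = exp(1 + 0.5 * x)))
d <- data.frame(x = x, w = w, y = y)

# Count part: y ~ x; zero-inflation part: ~ w
m_zip <- zeroinfl(y ~ x | w, data = d)

coef_count <- coef(m_zip, model = "count")  # Poisson part
coef_zero  <- coef(m_zip, model = "zero")   # zero-inflation (logit) part
print(coef_count)
print(coef_zero)
```

So the zero-inflated model is not merely a different family for the same
regression: it adds a second equation for the excess zeros.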

For glm.nb() I recommend its accompanying documentation, namely the MASS
book.


hth,
Z

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
