Thank you for the detailed answer; that was really helpful.
I have done some extensive reading and calculating in the hours since your
reply, and have a few (hopefully much better informed) follow-up questions.


1) In vignette("countreg", package = "pscl"), log-likelihood (LLH), AIC,
and BIC values are listed for the negative binomial (NB), zero-inflated
(ZI), ZI NB, hurdle NB, and standard Poisson models. Although I found a way
to determine the LLH and df for all models, the AIC and BIC values are not
displayed by default, not even with the code given in the vignette. How do
I calculate these? (By default, AIC is given for only 2 of the models, BIC
for none.)


2) For the zero-inflated models, the first block of count-model
coefficients is only included in the output for comparison purposes,
correct? That is, in the results section of a paper I would report only the
second block of (zero-inflation) coefficients? Or do I misunderstand
something? I am confused because in their large coefficient-comparison
table, the authors primarily compare the first block of coefficients (p. 18):
R> fm <- list("ML-Pois" = fm_pois, "Quasi-Pois" = fm_qpois, "NB" = fm_nbin,
+ "Hurdle-NB" = fm_hurdle, "ZINB" = fm_zinb)
R> sapply(fm, function(x) coef(x)[1:8])
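To make sure I am asking the right thing: if I read coef.zeroinfl()
correctly, the two blocks can be extracted separately, e.g.

coef(fm_zinb, model = "count")   ## first block: count component
coef(fm_zinb, model = "zero")    ## second block: zero-inflation component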


3)

> There are various formal tests for this, e.g., dispersiontest() in package
> "AER".
>

I have to run 9 models: I am testing the influence of several predictors on
different individual symptoms of a mental disorder, as "counted" over the
last week (0 = never in the last week, up to 3 = on all days within the
last week). So I am regressing the same predictors onto 9 different
symptoms in 9 separate models.
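Roughly, this is what I ran (the symptom, predictor, and data-frame names
are made up here):

library(AER)
symptoms <- paste0("symptom", 1:9)   ## the 9 outcome variables
fits <- lapply(symptoms, function(y)
  glm(reformulate(c("pred1", "pred2", "pred3"), response = y),
      family = poisson, data = dat))
## one-sided (overdispersion) and two-sided versions of the test
lapply(fits, dispersiontest, alternative = "greater")
lapply(fits, dispersiontest, alternative = "two.sided")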

dispersiontest() for these 9 models flags 3-4 models as overdispersed
(depending on whether I test one- or two-sided at the p = .05 level), 2 as
underdispersed, and 4 as non-significant. The by far largest dispersion
value, 2.1, occurs in a model that is not overdispersed according to the
test, but that is the symptom with 80% zeros, so maybe that plays a role
here.
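(For reference, I computed that zero proportion as follows, again with a
made-up name:

mean(dat$symptom7 == 0)   ## roughly 0.80 for this symptom
)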

Does NB also make sense for underdispersed models?

> However, overdispersion can already matter before this is detected by a
> significance test. Hence, if in doubt, I would simply use an NB model and
> you're on the safe side. And if the NB's estimated theta parameter turns
> out to be extremely large (say beyond 20 or 30), then you can still switch
> back to Poisson if you want.


Of the 9 models, the 3 overdispersed NB models "worked", with theta
estimates between 0.4 and 8.6. The results look just fine, so I guess NB is
appropriate there.

Four other models (including the 2 underdispersed ones) gave warnings
("iteration limit reached" / "alternation limit reached" for theta), and
although the other values look OK (estimates, LLH, AIC), the theta
estimates are between 1,000 and 11,000. Could you recommend what to do here?
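In case it helps, those models are fit with glm.nb() from MASS, roughly
like this (made-up names again):

library(MASS)
fm_nb <- glm.nb(symptom4 ~ pred1 + pred2 + pred3, data = dat)
fm_nb$theta      ## estimated theta (huge for these 4 models)
fm_nb$SE.theta   ## its standard error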

Thanks
T
