Re: [R] Underdispersion and count data

Achim Zeileis Thu, 07 Nov 2013 22:58:42 -0800

On Thu, 7 Nov 2013, sv.j...@yahoo.ca wrote:

Hello,
I have count data for 4 groups, 2 of which have a large number of zeroes and are overdispersed, and the other 2 underdispersed with no zeroes.

Are you sure that it's really underdispersion in addition to the lack of zeros? It could also be that, due to the missing zeros, there is simply less dispersion.
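
To illustrate this point, here is a minimal sketch in base R. The rate lambda = 1.5 is an arbitrary value assumed for illustration, not something taken from the data in this thread. A Poisson variable with its zeros removed has a variance below its mean, so a variance/mean ratio under 1 does not by itself establish underdispersion relative to a zero-truncated Poisson.

```r
# Mean and variance of a zero-truncated Poisson (ZTP) with rate lambda.
ztp_mean <- function(lambda) lambda / (1 - exp(-lambda))
ztp_var  <- function(lambda) {
  m <- ztp_mean(lambda)
  m * (1 + lambda - m)
}

lambda <- 1.5  # hypothetical rate, for illustration only
c(mean  = ztp_mean(lambda),
  var   = ztp_var(lambda),
  ratio = ztp_var(lambda) / ztp_mean(lambda))

# Simulation check: draw Poisson counts and discard the zeros.
set.seed(123)
y <- rpois(1e5, lambda)
y <- y[y > 0]
c(mean = mean(y), var = var(y))
```

Comparing the observed variance with ztp_var() at a fitted rate, rather than with the mean, is one rough way to see whether any underdispersion remains once the missing zeros are accounted for.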

I have two questions about model fitting, which I am quite new to, and have been using mostly the pscl package.
1 - How do I deal with underdispersion? Almost all the published and online advice is regarding overdispersion, and neither the Poisson nor negative binomial distribution seem appropriate. The COM Poisson comes up sometimes as a suggestion, but it's not clear to me how I can use this, explain my choice of it, or what information I would report for publication purposes.

There are (at least) two packages on CRAN, compoisson and COMPoissonReg, which support this.
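
As background on why the COM-Poisson can capture underdispersion, here is a small base-R sketch of its probability function, P(Y = y) proportional to lambda^y / (y!)^nu. It does not use either package; the helper names dcmp() and cmp_moments() and the truncation point ymax are hypothetical choices made for this illustration.

```r
# COM-Poisson pmf: P(Y = y) proportional to lambda^y / (y!)^nu.
# The normalizing constant is approximated by truncating the sum at ymax.
dcmp <- function(y, lambda, nu, ymax = 200) {
  k <- 0:ymax
  logw <- k * log(lambda) - nu * lgamma(k + 1)
  logZ <- max(logw) + log(sum(exp(logw - max(logw))))  # log-sum-exp
  exp(y * log(lambda) - nu * lgamma(y + 1) - logZ)
}

# Mean and variance computed from the (truncated) pmf.
cmp_moments <- function(lambda, nu, ymax = 200) {
  k <- 0:ymax
  p <- dcmp(k, lambda, nu, ymax)
  m <- sum(k * p)
  c(mean = m, var = sum((k - m)^2 * p))
}

cmp_moments(3, 1)    # nu = 1 reduces to the Poisson: variance equals mean
cmp_moments(3, 2)    # nu > 1: variance below the mean (underdispersion)
cmp_moments(3, 0.5)  # nu < 1: variance above the mean (overdispersion)
```

The extra parameter nu is what one would typically report alongside the regression coefficients, since it indicates the direction and strength of the departure from the Poisson.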

However, I would check first whether this is really needed or maybe a zero-truncated Poisson model is already sufficient.

The package "countreg" on R-Forge (https://R-Forge.R-project.org/R/?group_id=522) has a function zerotrunc() which is essentially the same code that hurdle() in "pscl" uses. So it should be easy for you to use.
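
For concreteness, here is a rough sketch of what fitting a zero-truncated Poisson regression involves, written in base R with optim() on simulated data. The data, the variable names y and x, and the true coefficients are all made up for illustration. The final guarded call assumes that zerotrunc() accepts the usual formula/data interface with a dist argument; please check ?zerotrunc after installing countreg before relying on it.

```r
# Simulated example data (hypothetical): Poisson counts with zeros removed.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rpois(n, exp(0.5 + 0.3 * x))
d <- data.frame(y = y, x = x)[y > 0, ]

# Negative log-likelihood of a zero-truncated Poisson with log link:
# log f(y) = log dpois(y, lambda) - log(1 - exp(-lambda)), for y >= 1.
nll <- function(beta, y, X) {
  lambda <- exp(drop(X %*% beta))
  -sum(dpois(y, lambda, log = TRUE) - log1p(-exp(-lambda)))
}

X <- model.matrix(~ x, data = d)
fit <- optim(c(0, 0), nll, y = d$y, X = X, method = "BFGS", hessian = TRUE)
est <- cbind(estimate = fit$par, se = sqrt(diag(solve(fit$hessian))))
rownames(est) <- colnames(X)
est

# With countreg installed from R-Forge, the analogous call would presumably be
# (assumed interface, not verified here):
if (requireNamespace("countreg", quietly = TRUE)) {
  print(countreg::zerotrunc(y ~ x, data = d, dist = "poisson"))
}
```

The hand-rolled likelihood is only there to show what the model is; for real analyses the packaged function is preferable because it provides the usual summary(), predict() and related methods.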

2 - For the overdispersed data with lots of zeroes, I've tried zero-inflated Poisson and NegBin and hurdle models, and used the Vuong test to compare. However, I get equal fit for two candidate models that produce quite different coefficient estimates for my predictor variables, and hence different p-values. I am unsure how to proceed in choosing one of these models, and how I would justify one over the other given that the Vuong test seems not to discriminate.

Is it just zero-inflated vs. hurdle or also differences in the regressors? If it is just zero-inflated vs. hurdle: the two models are parametrized differently but often lead to similar fits. The zero-inflated model has a count part plus a zero-inflation part, whereas the hurdle model has a zero-truncated count part and a zero hurdle.

If the regressors are different, then it's probably a subject-matter decision.
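
A small base-R sketch of the difference in parametrization between zero-inflated and hurdle models, for the simplest case without regressors. The helper names dzip() and dhurdle() and the parameter values are hypothetical, chosen only for illustration. Here the two models describe exactly the same distribution once the hurdle probability is matched to the zero-inflation probability, which is one reason the fits can be indistinguishable while the coefficients in the zero parts mean different things.

```r
# Zero-inflated Poisson: a point mass at zero mixed with a full Poisson.
dzip <- function(y, lambda, pi0) {
  ifelse(y == 0, pi0, 0) + (1 - pi0) * dpois(y, lambda)
}

# Hurdle Poisson: P(Y > 0) = p, and positive counts follow a
# zero-truncated Poisson.
dhurdle <- function(y, lambda, p) {
  ifelse(y == 0, 1 - p, p * dpois(y, lambda) / (1 - dpois(0, lambda)))
}

lambda <- 2    # hypothetical count rate
pi0    <- 0.3  # hypothetical zero-inflation probability
p      <- (1 - pi0) * (1 - exp(-lambda))  # matching hurdle-crossing probability

y <- 0:15
all.equal(dzip(y, lambda, pi0), dhurdle(y, lambda, p))
```

With regressors the correspondence is no longer exact, because pi0 and p are linked to the covariates in different ways, so similar likelihoods together with rather different coefficient estimates in the zero components are not necessarily a contradiction.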


hth,
Z

Thank you and any advice would be much appreciated.

Mo


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


