Hi all,
I have a data set spanning 12 years, each with a number of males and a
number of females. I tested the relationship between the sex ratio
(proportion of males over the total) and a set of predictors, weighting
each year by its number of individuals.
In R:
glm.1 <- glm(cbind(males, females) ~ predictor, family = binomial, data = data)
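To make the weighting explicit, here is a minimal sketch on simulated data (the counts, the column `total` and the predictor values are made up for illustration, not my real data). It fits the two-column form above and, as far as I understand, the equivalent form with the proportion of males as response and the yearly sample size as weights:

```r
# Simulated data for illustration only: 12 years with hypothetical counts.
set.seed(1)
data <- data.frame(predictor = rnorm(12),
                   total     = sample(14:880, 12))
data$males   <- rbinom(12, size = data$total,
                       prob = plogis(0.2 * data$predictor))
data$females <- data$total - data$males

# Two-column response: successes (males) and failures (females).
glm.1 <- glm(cbind(males, females) ~ predictor,
             family = binomial, data = data)

# Proportion of males as response, weighted by the yearly sample size.
glm.1w <- glm(males / total ~ predictor, family = binomial,
              weights = total, data = data)

# The two parameterisations should give the same coefficients.
all.equal(coef(glm.1), coef(glm.1w))
```

Years with many individuals therefore pull the fit more strongly than years with few.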
With this aim I prepared a set of candidate models each one representing
a specific biological hypothesis. I work with two data sets because I used two
sexing methods: one data set includes extra individuals sexed each year
with the second method. Hence the data sets have different sample sizes (min=14
and 43, max=880 and 950, mean=244 and 324, respectively). I ran the
same candidate set and the same analysis on each data set. I considered four
predictors but each model contained at most two predictors (one
categorical predictor plus one of the other three that, instead, are
continuous). The categorical predictor has a clear effect on the sex
ratio, as shown by simple plots of the data and as expected from the logic
behind the hypothesis it represents. I know both analyses are at risk of
being overparameterized, but I trust that QAICc (Akaike Information Criterion
corrected for small samples and overdispersion) takes care of this
problem.
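For reference, this is roughly how the QAICc ranking can be sketched, assuming the usual form QAICc = -2*logLik/c-hat + 2K + 2K(K+1)/(n-K-1). The helper names `qaicc` and `chat_pearson`, the simulated data frame `d`, the choice of n as the number of years, and counting one extra parameter for c-hat are all assumptions of this sketch rather than a fixed recipe:

```r
# QAICc for a fitted glm; 'chat' is the overdispersion estimate and
# 'n' the sample size (here assumed to be the number of years).
qaicc <- function(model, chat = 1, n = nobs(model)) {
  k  <- length(coef(model)) + 1   # +1 for the estimated c-hat (a convention)
  ll <- as.numeric(logLik(model))
  -2 * ll / chat + 2 * k + 2 * k * (k + 1) / (n - k - 1)
}

# c-hat as Pearson chi-square over residual df of the most complex model.
chat_pearson <- function(global_model) {
  sum(residuals(global_model, type = "pearson")^2) /
    df.residual(global_model)
}

# Simulated data for illustration only (hypothetical names and values).
set.seed(2)
d <- data.frame(group = factor(rep(c("a", "b"), 6)),
                x     = rnorm(12),
                total = sample(14:880, 12))
d$males   <- rbinom(12, d$total, plogis(0.3 * (d$group == "b")))
d$females <- d$total - d$males

m0 <- glm(cbind(males, females) ~ 1,         family = binomial, data = d)
m1 <- glm(cbind(males, females) ~ group,     family = binomial, data = d)
m2 <- glm(cbind(males, females) ~ group + x, family = binomial, data = d)

chat <- max(1, chat_pearson(m2))  # values below 1 are set to 1
q <- sapply(list(null = m0, group = m1, group_x = m2), qaicc, chat = chat)
q - min(q)                        # delta QAICc for each candidate model
```

With only 12 years, the small-sample term 2K(K+1)/(n-K-1) grows quickly with K, which is why I kept each model to at most two predictors.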
In fact, for the smaller data set I don't find any clear pattern and, as
a result, the null (only intercept) model performs as well as the one
considering the categorical predictor. I report the QAICc (c-hat=1.2)
ranking and, as a measure of effect size, Nagelkerke's pseudo-R2,
which in this case, for the best-ranked non-null model (the categorical
predictor), is about 0.3.
For the bigger data set I find very clear results and the model
accounting for the categorical predictor plus another (continuous)
predictor is ranked first, more than six deltaAICc units (c-hat=1) ahead of the
next one (the one with only the categorical predictor).
In this case, Nagelkerke's pseudo-R2 is about 0.95 and I feel somewhat
uncomfortable with such a high, seemingly optimistic estimate.
In R, Nagelkerke's pseudo-R2 was computed following Faraway (2006) as:
R2.nagelkerke <- (1 - exp((glm.1$deviance - glm.1$null.deviance) / nrow(data))) /
  (1 - exp(-glm.1$null.deviance / nrow(data)))
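The same computation can be wrapped in a small function so that the sample size n is explicit. This is only a sketch: `nagelkerke_r2` is a hypothetical helper name, the data are simulated, and n defaults to the number of rows as in my computation. With grouped binomial data another choice of n (for instance the total number of individuals) gives a different value, so I expose it as an argument without claiming which is appropriate:

```r
# Deviance-based Nagelkerke pseudo-R2 with an explicit sample size 'n'.
nagelkerke_r2 <- function(model, n = nrow(model$model)) {
  dev  <- model$deviance
  null <- model$null.deviance
  (1 - exp((dev - null) / n)) / (1 - exp(-null / n))
}

# Simulated data for illustration only (hypothetical names and values).
set.seed(3)
sim <- data.frame(predictor = rnorm(12),
                  total     = sample(43:950, 12))
sim$males   <- rbinom(12, sim$total, plogis(0.5 * sim$predictor))
sim$females <- sim$total - sim$males

fit <- glm(cbind(males, females) ~ predictor,
           family = binomial, data = sim)

r2_rows  <- nagelkerke_r2(fit)                     # n = number of years
r2_indiv <- nagelkerke_r2(fit, n = sum(sim$total)) # n = number of individuals
c(rows = r2_rows, individuals = r2_indiv)
```

Comparing the two numbers shows how sensitive the estimate is to what is counted as an observation.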
Any opinion/suggestion on this case?
Faraway, J. J. (2006). Extending the Linear Model with R. Boca Raton, FL:
Chapman & Hall/CRC.