Dear R-help,

I am trying to work out whether I am justified in log-transforming the response and also specifying a Gamma error family in the same glm, or whether it has to be one or the other. I have attached an R script and the data file to show what I mean.

I also cannot find a mixed-model function that allows Gamma errors, so I cannot find a way of including random effects. What should I do?

Many thanks,
Pete
# Question: can you log-transform and specify Gamma in the same model?
ToadsBd <- read.table(file.choose(), header = TRUE)
print(ToadsBd)

# first see how well treatment group predicts Bd score with non-log-transformed data
# (data = ToadsBd is supplied so that the columns are found without attach())
mod1 <- glm(Bd ~ factor(group), data = ToadsBd)
summary(mod1)
# massively overdispersed. Are the data non-normal?
shapiro.test(ToadsBd$Bd)
# W = 0.3652, p-value = 5.666e-13
# yes, definitely non-normal

# try log-transforming the data and see if that helps
plot(qqnorm(ToadsBd$Bd), log = "y")
# log plot straightens it out, almost, so yes, log-transforming helps

# try the model again with log-transformed Bd score
mod2 <- glm(logBd ~ factor(group), data = ToadsBd)
summary(mod2)
# a big improvement but still overdispersed

# other options: specify an error family? The original data look Gamma distributed.
# should test whether the variance increases or stays constant with the mean
# on the scale of the original, non-logged data
par(mfrow = c(2, 2))
plot(mod1)
# can you tell this from a diagnostic plot? Not sure how. If not, how do you assess this?

# in the meantime, assume it does and try Gamma (default link = reciprocal) with non-logged data
mod3 <- glm(Bd ~ factor(group), family = Gamma, data = ToadsBd)
summary(mod3)
# mod3 is a major improvement on mod1 and less dispersed than mod2,
# but it has a much larger AIC than mod2

# is it valid to specify Gamma in a model where the data have been log-transformed?
# or does it have to be a choice between transformation and Gamma?
# if I specify both, the model is quite good, but it may not be valid. Please help!
mod4 <- glm(logBd ~ factor(group), family = Gamma, data = ToadsBd)
summary(mod4)
# residual deviance is now well below df, not overdispersed,
# and the effect of group on Bd is significant

# I would also like to assess the effect of site, but this is a random effect
# requiring a mixed model.
# I cannot find a mixed model that works with Gamma errors. What can I do?
toad group       Bd  logBd startg site
   1     1      0.5  0.405   13.6    0
   2     1      0.3  0.262   15.9    0
   3     1      0.3  0.262   14.4    0
   4     1      0.4  0.336   15.3    0
   5     1      6.5  2.015   15.1    0
   6     1      0.1  0.095   15.7    0
   7     1      0.2  0.182   20.2    0
   8     1     17.7  2.929   17.3    0
   9     1      0.6  0.470   18.7    0
  10     1      0.1  0.095   24.6    1
  11     1      0.6  0.470   20      1
  12     1      9    2.303   16.3    1
  13     1      1.6  0.956   19.4    1
  14     1      3.4  1.482   12.8    1
  15     1      6.3  1.988   19.7    1
  16     2      1.3  0.833   12.6    0
  17     2     63.3  4.164   22.6    0
  18     2      0.7  0.531   18.3    0
  19     2     33.2  3.532   15.5    0
  20     2      2.2  1.163   13.2    0
  21     2    479    6.174   16.4    0
  22     2      0.1  0.095   19.1    0
  23     2     47.6  3.884   16.1    0
  24     2    195.6  5.281   14.1    0
  25     2     41    3.738   16.3    0
  26     2   1984.2  7.593   13.7    1
  27     2      6.3  1.988   13.9    1
  28     2    126.7  4.850   22      1
  29     2    105.1  4.664   12.7    1
  30     2   6747.8  8.817   18.2    1
  31     2    282.6  5.648   15.8    1
  32     3      1.6  0.956   18.6    0
  33     3   2576.3  7.854   15.3    0
  34     3  11240    9.327   17.4    0
  35     3    678.1  6.521   18.8    0
  36     3   9926.8  9.203   17.5    0
  37     3    103.4  4.648   16.1    0
  38     3   2401.7  7.784   15.5    0
  39     3   2616.4  7.870   16.5    0
  40     3     35.3  3.592   18.9    0
  41     3    174.7  5.169   22.7    0
  42     3    362    5.894   17.5    1
  43     3   2765.7  7.925   13.8    1
  44     3  29033.8 10.276   16.5    1
  45     3     34    3.555   21.1    1
  46     3    258.4  5.558   15.9    1
  47     3     10.1  2.407   14.9    1
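The script asks how to check whether the variance grows with the mean on the original scale, and whether Gamma errors can be combined with a log scale. The following is only a rough sketch of one way to look at both, not a definitive analysis. The Bd and group vectors are re-typed from the data table; the object names (grp_mean, grp_var, grp_cv, mv_slope, mod_gamma_log, mod_lognormal) are made up for illustration and are not part of the attached script. It assumes that the logBd column is log(Bd + 1), which the tabulated values appear to match, and it raises the iteration limit as a precaution because the data are so skewed.

```r
# Bd score and treatment group for the 47 toads, re-typed from the data table.
Bd <- c(0.5, 0.3, 0.3, 0.4, 6.5, 0.1, 0.2, 17.7, 0.6, 0.1, 0.6, 9, 1.6, 3.4, 6.3,
        1.3, 63.3, 0.7, 33.2, 2.2, 479, 0.1, 47.6, 195.6, 41, 1984.2, 6.3,
        126.7, 105.1, 6747.8, 282.6,
        1.6, 2576.3, 11240, 678.1, 9926.8, 103.4, 2401.7, 2616.4, 35.3, 174.7,
        362, 2765.7, 29033.8, 34, 258.4, 10.1)
group <- factor(rep(1:3, times = c(15, 16, 16)))

# (1) Mean-variance check on the original (non-logged) scale.
# Under Gamma errors the variance is proportional to mean^2, so the
# coefficient of variation (sd / mean) should be roughly constant across groups.
grp_mean <- tapply(Bd, group, mean)
grp_var  <- tapply(Bd, group, var)
grp_cv   <- sqrt(grp_var) / grp_mean
print(cbind(mean = grp_mean, variance = grp_var, cv = grp_cv))

# Slope of log(variance) against log(mean): near 2 points towards Gamma-like
# (or lognormal-like) errors, near 0 towards constant variance.
# With only three groups this is a rough guide, not a test.
mv_slope <- unname(coef(lm(log(grp_var) ~ log(grp_mean)))[2])
print(mv_slope)

# (2) Gamma errors with a log link, fitted to untransformed Bd.
# The log link puts the linear predictor on the log scale of the mean,
# so the response itself does not need to be transformed.
# maxit is raised above the default as a precaution (an assumption, not a requirement).
mod_gamma_log <- glm(Bd ~ group, family = Gamma(link = "log"),
                     control = glm.control(maxit = 100))
print(summary(mod_gamma_log))

# (3) For comparison: ordinary linear model on log(Bd + 1).
mod_lognormal <- lm(log1p(Bd) ~ group)
print(summary(mod_lognormal))
```

Note that the AIC values of (2) and (3) are not directly comparable, because the response is on a different scale in each model; the same applies to comparing mod2 and mod3 in the script. For the random effect of site, MASS::glmmPQL and lme4::glmer both accept a Gamma family, although site has only two levels in this data set, so a random-effect variance would be poorly estimated; that fit is not shown here.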
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.