Dear R-help users, I am trying to analyze some visual transect data of organisms to generate a habitat distribution model. Once organisms are sighted, they are followed as point data is collected at a given time interval. Because of the autocorrelation among these "follows," I wish to utilize a GAM-GEE approach similar to that of Pirotta et al. 2011, using packages 'yags' and 'splines' (http://www.int-res.com/abstracts/meps/v436/p257-272/). Their R scripts are shown here ( http://www.int-res.com/articles/suppl/m436p257_supp/m436p257_supp1-code.r). I have used this code with limited success and multiple issues of models failing to converge.
Below is the structure of my data: > str(dat2) 'data.frame': 10792 obs. of 4 variables: $ dist_slag : num 26475 26340 25886 25400 24934 ... $ Depth : num -10.1 -10.5 -16.6 -22.2 -29.7 ... $ dolphin_presence: int 0 0 0 0 0 0 0 0 0 0 ... $ block : int 1 1 1 1 1 1 1 1 1 1 ... > head(dat2) dist_slag Depth dolphin_presence block 1 26475.47 -10.0934 0 1 2 26340.47 -10.4870 0 1 3 25886.33 -16.5752 0 1 4 25399.88 -22.2474 0 1 5 24934.29 -29.6797 0 1 6 24519.90 -26.2370 0 1 Here is the summary of my block variable (indicating the number of groups for which autocorrelation exists within each block > summary(dat2$block) Min. 1st Qu. Median Mean 3rd Qu. Max. 1.00 39.00 76.00 73.52 111.00 148.00 However, I would like to use the package 'gamm4', as I am more familiar with Professor Simon Wood's packages and functions, and it appears gamm4 might be the most appropriate. It is important to note that the models have a binary response (organism presence of absence along a transect), and thus why I think gamm4 is more appropriate than gamm. In the gamm help, it provides the following example for autocorrelation within factors: ## more complicated autocorrelation example - AR errors ## only within groups defined by `fac' e <- rnorm(n,0,sig) for (i in 2:n) e[i] <- 0.6*e[i-1]*(fac[i-1]==fac[i]) + e[i] y <- f + e b <- gamm(y~s(x,k=20),correlation=corAR1(form=~1|fac)) Following this example, the following is the code I used for my dataset b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),random=(form=~1|block), family=binomial(),data=dat) However, by examining the output (summary(b$gam)) and specifically summary(b$mer)), I am either unsure of how to interpret the results, or do not believe that the autocorrelation within the group is being taken into consideration. > summary(b$gam) Family: binomial Link function: logit Formula: dolphin_presence ~ s(dist_slag) + s(Depth) Parametric coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -13.968 5.145 -2.715 0.00663 ** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Approximate significance of smooth terms: edf Ref.df Chi.sq p-value s(dist_slag) 4.943 4.943 70.67 6.85e-14 *** s(Depth) 6.869 6.869 115.59 < 2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 R-sq.(adj) = 0.317glmer.ML score = 10504 Scale est. = 1 n = 10792 > > summary(b$mer) Generalized linear mixed model fit by the Laplace approximation AIC BIC logLik deviance 10514 10551 -5252 10504 Random effects: Groups Name Variance Std.Dev. Xr s(dist_slag) 1611344 1269.39 Xr.0 s(Depth) 98622 314.04 Number of obs: 10792, groups: Xr, 8; Xr.0, 8 Fixed effects: Estimate Std. Error z value Pr(>|z|) X(Intercept) -13.968 5.145 -2.715 0.00663 ** Xs(dist_slag)Fx1 -35.871 33.944 -1.057 0.29063 Xs(Depth)Fx1 3.971 3.740 1.062 0.28823 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Correlation of Fixed Effects: X(Int) X(_)F1 Xs(dst_s)F1 0.654 Xs(Dpth)Fx1 -0.030 0.000 > How do I ensure that autocorrelation is indeed being accounted for within each unique value of the "block" variable? What is the simplest way to interpret the output for "summary(b$mer)"? The results do differ from a normal gam (package mgcv) using the same variables and parameters without the "correlation=..." term, indicating that *something *different is occurring. However, when I use a different variable for the correlation term (season), I get the SAME output: > dat2 <- data.frame(dist_slag = dat$dist_slag, Depth = dat$Depth, > dolphin_presence = dat$dolphin_presence, + block = dat$block, season=dat$season) > head(dat2) dist_slag Depth dolphin_presence block season 1 26475.47 -10.0934 0 1 F 2 26340.47 -10.4870 0 1 F 3 25886.33 -16.5752 0 1 F 4 25399.88 -22.2474 0 1 F 5 24934.29 -29.6797 0 1 F 6 24519.90 -26.2370 0 1 F > summary(dat2$season) F S 3224 7568 > b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),correlation=corAR1(1, > form=~1 | season), family=binomial(),data=dat2) > summary(b$gam) Family: binomial Link function: logit Formula: dolphin_presence ~ s(dist_slag) + s(Depth) Parametric coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -13.968 5.145 -2.715 0.00663 ** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Approximate significance of smooth terms: edf Ref.df Chi.sq p-value s(dist_slag) 4.943 4.943 70.67 6.85e-14 *** s(Depth) 6.869 6.869 115.59 < 2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 R-sq.(adj) = 0.317glmer.ML score = 10504 Scale est. = 1 n = 10792 > summary(b$mer) Generalized linear mixed model fit by the Laplace approximation AIC BIC logLik deviance 10514 10551 -5252 10504 Random effects: Groups Name Variance Std.Dev. Xr s(dist_slag) 1611344 1269.39 Xr.0 s(Depth) 98622 314.04 Number of obs: 10792, groups: Xr, 8; Xr.0, 8 Fixed effects: Estimate Std. Error z value Pr(>|z|) X(Intercept) -13.968 5.145 -2.715 0.00663 ** Xs(dist_slag)Fx1 -35.871 33.944 -1.057 0.29063 Xs(Depth)Fx1 3.971 3.740 1.062 0.28823 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Correlation of Fixed Effects: X(Int) X(_)F1 Xs(dst_s)F1 0.654 Xs(Dpth)Fx1 -0.030 0.000 > I just want to make sure it is correctly allowing for correlation within each value for the "block" variable. How do I formulate the model to say that autocorrelation can exist within each single value for block, but assume independence among blocks? On another note, I am also receiving the following warning message after model completion for larger models (with many more variables than 2): Warning message: In mer_finalize(ans) : false convergence (8) Thank you for your time, effort, and patience dealing with a novice. Sincerely, Nathan [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.