Hello R users,

I'm working with a time series spanning several years, and I'm analyzing it with GAM smoothers from the package mgcv. I'm constructing models where zooplankton biomass (bm) is the dependent variable and the continuous explanatory variables are:

- time in Julian days (t), to create a long-term linear trend
- Julian day of the year (t_year), to create an annual cycle
- mean winter temperature (temp_W), September temperature (temp_sept), or Chla

Questions:

1) To introduce a tensor product modifying the annual cycle in my model, I tried 2 different approaches:

   a) gam(bm ~ t + te(t_year, temp_W, temp_sept, k = c(5, 30), d = c(1, 2), bs = c("cc", "cr")), data = data)

   b) gam(bm ~ t + te(t_year, temp_W, temp_sept, k = 5, bs = c("cc", "cr", "cr")), data = data)

Here is my problem: when I use just 2 variables (e.g., t_year and temp_W) for the tensor product, I can understand pretty well how the interpolation works and visualize it with vis.gam() as a 3D plot or a contour plot. But with 3 variables it is difficult for me to understand how it works. Besides, I don't know which one is the proper way to construct it, a) or b).

Finally, when I plot a) or b) with vis.gam(model_name, view = c("t_year", "temp_W")), how should I interpret the plot? Does it show the effect of temp_W on the annual cycle after already accounting for the effect of temp_sept, or just the individual effect of temp_W on the annual cycle?

2) I'm trying to do model selection using AIC, and I have a question about it: should I always use the same type of smoothing basis (bs), the same type of smoother (e.g., te) and the same basis dimension (k)?
Example:

Option 1:

   a) mod1 <- gam(bm ~ t, data = data)
   b) mod2 <- gam(bm ~ te(t, k = 5, bs = "cr"), data = data)
   c) mod3 <- gam(bm ~ te(t_year, k = 5, bs = "cc"), data = data)
   d) mod4 <- gam(bm ~ te(t_year, temp_W, k = 5, bs = c("cc", "cr")), data = data)
   e) mod5 <- gam(bm ~ te(t_year, temp_W, temp_sept, k = 5, bs = c("cc", "cr", "cr")), data = data)

Here k is limited to 5 because of mod5; I don't use s() because te() is used in mod4 and mod5; and I always use "cr" and "cc".

Option 2:

   a) mod1 <- gam(bm ~ t, data = data)
   b) mod2 <- gam(bm ~ s(t, k = 13, bs = "cr"), data = data)
   c) mod3 <- gam(bm ~ s(t_year, k = 13, bs = "cc"), data = data)
   d) mod4 <- gam(bm ~ te(t_year, temp_W, k = 11, bs = c("cc", "cr")), data = data)
   e) mod5 <- gam(bm ~ te(t_year, temp_W, temp_sept, k = 5, bs = c("cc", "cr", "cr")), data = data)

I can get a lower AIC for each of the models with Option 2, but are the models comparable by AIC when they are built this way? Or is Option 1 the proper way to do it? The comparison itself is made with:

   AIC(mod1, mod2, mod3, mod4, mod5)
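For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch of models c) and d) from Option 2. It does not use the real series: the data frame `dat`, the 20 simulated years, the 10-day sampling interval, the simulated temperatures and biomass, and the seed are all assumptions made purely for illustration.

```r
library(mgcv)

set.seed(42)  # arbitrary seed, only so the simulation is repeatable

# Hypothetical stand-in for the real series: 20 years sampled every 10 days.
n_years <- 20
dat <- expand.grid(t_year = seq(1, 361, by = 10), year = seq_len(n_years))
dat$t <- (dat$year - 1) * 365 + dat$t_year  # time in days since the start

# One simulated winter temperature per year, repeated within that year.
temp_W_by_year <- rnorm(n_years, mean = 10, sd = 1.5)
dat$temp_W <- temp_W_by_year[dat$year]

# Simulated biomass: an annual cycle whose amplitude depends on temp_W, plus noise.
dat$bm <- 5 +
  (1 + 0.3 * (dat$temp_W - 10)) * sin(2 * pi * dat$t_year / 365) +
  rnorm(nrow(dat), sd = 0.5)

# Option 2 c): annual cycle only, cyclic cubic regression spline.
mod3 <- gam(bm ~ s(t_year, k = 13, bs = "cc"), data = dat)

# Option 2 d): annual cycle modified by winter temperature (tensor product).
mod4 <- gam(bm ~ te(t_year, temp_W, k = 11, bs = c("cc", "cr")), data = dat)

# AIC table for the two fits (columns: df, AIC).
aic_table <- AIC(mod3, mod4)
print(aic_table)

# Contour view of the fitted surface over t_year and temp_W.
vis.gam(mod4, view = c("t_year", "temp_W"), plot.type = "contour")
```

With the real data, `dat` would simply be replaced by the actual data frame, and the remaining models (mod1, mod2, mod5) can be added to the AIC() call in the same way.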
Thank you in advance.

Best regards,
Ricardo González-Gil

--
View this message in context: http://r.789695.n4.nabble.com/te-interactions-and-AIC-model-selection-with-GAM-tp4638368.html
Sent from the R help mailing list archive at Nabble.com.