> On 23 Nov 2016, at 23:32 , Simon Wood <simon.w...@bath.edu> wrote: > > ?predict.gam (mgcv) says.... > > "Note that, in common with other prediction functions, any offset > supplied to ‘gam’ as an argument is always ignored when > predicting, unlike offsets specified in the gam model formula." >
> .... which was originally implemented to prevent surprises to people familiar > with predict.lm (which used to behave that way, and was documented to behave > that way).... the problem is that predict.lm doesn't behave like that any > more, so I guess at some point I should remove this feature... > Fortune candidate... -pd > best, > Simon > > > > > On 23/11/16 00:24, Marine Regis wrote: >> Thanks a lot for your answers. >> Peter: sorry, here is the missing information: >> >> * I use the function gam() of the package �mgcv� >> * Yes, the output changes when I use offset(log_trap_eff) instead of >> offset=log_trap_eff. By using offset(log_trap_eff), the output is more >> coherent with the observed values. Here are the new predictions: >> >>> summary(mod$fit) >>> >> Min. 1st Qu. Median Mean 3rd Qu. Max. >> 10.01 68.14 85.71 83.16 101.00 130.20 >> >> >> * I have tried to create a reproductive example to show the difference >> between offset(log_trap_eff) and offset=log_trap_eff. >> nb_unique <- rnegbin(58, mu=82, theta=13.446) >> x <- runif(58,min=-465300,max=435200) >> prop_forest <- runif(58,min=0,max=1) >> log_trap_eff <- runif(58,min=4,max=6) >> >> With offset=log_trap_eff: >> >> mod1 <- gam(nb_unique ~ s(x,prop_forest), offset=log_trap_eff, >> family=nb(theta=NULL, link="log"), method = "REML", select = TRUE) >> >> >>> mod1Pred <- predict.gam(mod1, se.fit=TRUE, type="response") >>> >>> summary(mod1Pred$fit) >>> >> Min. 1st Qu. Median Mean 3rd Qu. Max. >> >> 0.5852 0.5852 0.5852 0.5852 0.5852 0.5852 >> >> >> With offset(log_trap_eff): >> mod2 <- gam(nb_unique ~ s(x,prop_forest) + offset(log_trap_eff), >> family=nb(theta=NULL, link="log"), method = "REML", select = TRUE) >> >> >>> mod2Pred <- predict.gam(mod2, se.fit=TRUE, type="response") >>> >>> summary(mod2Pred$fit) >>> >> Min. 1st Qu. Median Mean 3rd Qu. Max. >> >> 32.03 61.18 97.20 112.20 165.00 226.00 >> >> >> >> Value range of observed data: >> >> >>> summary(nb_unique) >>> >> Min. 1st Qu. Median Mean 3rd Qu. Max. >> >> 43.00 67.00 81.00 84.16 92.75 153.00 >> >> >> * By using fitted(mod), I obtain NULL. >> I am a novice in GAMs. So, I don�t know why the results are different >> between models with offset=argument and offset(). >> Thanks a lot for your help. >> Have a nice day >> Marine >> >> >> >> ________________________________ >> De : peter dalgaard >> <pda...@gmail.com> >> >> Envoy� : mardi 22 novembre 2016 23:52 >> � : Bert Gunter >> Cc : Marine Regis; >> r-help@r-project.org >> >> Objet : Re: [R] GAM with the negative binomial distribution: why do >> predictions no match with original values? >> >> >> >>> On 22 Nov 2016, at 23:07 , Bert Gunter <bgunter.4...@gmail.com> >>> wrote: >>> >>> Define "very different." Sounds like a subjective opinion to me, for >>> which I have no response. Apparently others are similarly flummoxed. >>> Of course they would not in general be identical. >>> >> Er? I don't see much reason to disagree that a range 0.10-0.18 is different >> from 17-147. >> >> However, other bits of information are missing: We don't know which gam() >> function is being used (to my knowledge there is one in package gam but also >> one in mgcv). We don't have the data, so we cannot reproduce and try to find >> the root of the problem. >> >> Offhand, it looks like the predict.gam() function is misbehaving, which >> could have something to do with the offset term and/or the nb dispersion >> parameter. On a hunch, does anything change if you use >> >> nb_unique ~ s(x,prop_forest) + offset(log_trap_eff) >> >> instead of the offset= argument? And, by the way, does fitted(mod,...) >> change anything? >> >> -pd >> >> >>> Cheers, >>> Bert >>> >>> >>> Bert Gunter >>> >>> "The trouble with having an open mind is that people keep coming along >>> and sticking things into it." >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) >>> >>> >>> On Tue, Nov 22, 2016 at 1:29 PM, Marine Regis >>> <marine.re...@hotmail.fr> >>> wrote: >>> >>>> Hello, >>>> >>>> >>>>> From capture data, I would like to assess the effect of longitudinal >>>>> changes in proportion of forests on abundance of skunks. To test this, I >>>>> built this GAM where the dependent variable is the number of unique >>>>> skunks and the independent variables are the X coordinates of the >>>>> centroids of trapping sites (called "X" in the GAM) and the proportion of >>>>> forests within the trapping sites (called "prop_forest" in the GAM): >>>>> >>>> mod <- gam(nb_unique ~ s(x,prop_forest), offset=log_trap_eff, >>>> family=nb(theta=NULL, link="log"), data=succ_capt_skunk, method = "REML", >>>> select = TRUE) >>>> summary(mod) >>>> >>>> Family: Negative Binomial(13.446) >>>> Link function: log >>>> >>>> Formula: >>>> nb_unique ~ s(x, prop_forest) >>>> >>>> Parametric coefficients: >>>> Estimate Std. Error z value Pr(>|z|) >>>> (Intercept) -2.02095 0.03896 -51.87 <2e-16 *** >>>> --- >>>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >>>> >>>> Approximate significance of smooth terms: >>>> edf Ref.df Chi.sq p-value >>>> s(x,prop_forest) 3.182 29 17.76 0.000102 *** >>>> --- >>>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 >>>> >>>> R-sq.(adj) = 0.37 Deviance explained = 49% >>>> -REML = 268.61 Scale est. = 1 n = 58 >>>> >>>> >>>> I built a GAM for the negative binomial family. When I use the function >>>> `predict.gam`, the predictions of capture success from the GAM and the >>>> values of capture success from original data are very different. What is >>>> the reason for differences occur? >>>> >>>> **With GAM:** >>>> >>>> modPred <- predict.gam(mod, se.fit=TRUE,type="response") >>>> summary(modPred$fit) >>>> Min. 1st Qu. Median Mean 3rd Qu. Max. >>>> 0.1026 0.1187 0.1333 0.1338 0.1419 0.1795 >>>> >>>> **With original data:** >>>> >>>> summary(succ_capt_skunk$nb_unique) >>>> Min. 1st Qu. Median Mean 3rd Qu. Max. >>>> 17.00 59.00 82.00 81.83 106.80 147.00 >>>> >>>> The question has already been posted on Cross validated ( >>>> http://stats.stackexchange.com/questions/247347/gam-with-the-negative-binomial-distribution-why-do-predictions-no-match-with-or >>>> ) without success. >>>> >> [http://cdn.sstatic.net/Sites/stats/img/apple-touch-i...@2.png?v=344f57aa10cc&a]<http://stats.stackexchange.com/questions/247347/gam-with-the-negative-binomial-distribution-why-do-predictions-no-match-with-or> >> >> >> GAM with the negative binomial distribution: why do predictions no match >> with original values? >> <http://stats.stackexchange.com/questions/247347/gam-with-the-negative-binomial-distribution-why-do-predictions-no-match-with-or> >> >> stats.stackexchange.com >> >From capture data, I would like to assess the effect of longitudinal >> >changes in proportion of forests on abundance of skunks. To test this, I >> >built this GAM where the dependent variable is the numb... >> >> >> >> >>>> Thanks a lot for your time. >>>> Have a nice day >>>> Marine >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> >>>> R-help@r-project.org >>>> mailing list -- To UNSUBSCRIBE and more, see >>>> >>>> https://stat.ethz.ch/mailman/listinfo/r-help >> thz.ch/mailman/listinfo/r-help> >> stat.ethz.ch >> The main R mailing list, for announcements about the development of R and >> the availability of new code, questions and answers about problems and >> solutions using R ... >> >> >> >> >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> ______________________________________________ >>> >>> R-help@r-project.org >>> mailing list -- To UNSUBSCRIBE and more, see >>> >>> https://stat.ethz.ch/mailman/listinfo/r-help >> thz.ch/mailman/listinfo/r-help> >> stat.ethz.ch >> The main R mailing list, for announcements about the development of R and >> the availability of new code, questions and answers about problems and >> solutions using R ... >> >> >> >> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> >>> and provide commented, minimal, self-contained, reproducible code. >>> >> -- >> Peter Dalgaard, Professor, >> Center for Statistics, Copenhagen Business School >> Solbjerg Plads 3, 2000 Frederiksberg, Denmark >> Phone: (+45)38153501 >> Office: A 4.23 >> Email: >> pd....@cbs.dk Priv: pda...@gmail.com >> >> >> >> >> >> >> >> >> >> >> >> [[alternative HTML version deleted]] >> >> >> >> >> ______________________________________________ >> >> R-help@r-project.org >> mailing list -- To UNSUBSCRIBE and more, see >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> > > > -- > Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK > +44 (0)117 33 18273 > http://www.maths.bris.ac.uk/~sw15190 -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.