If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re") will produce a random effect for subject. i.e. each subject will be given its own random intercept term, which is a way that repeated measures data like this are often handled.

The reason for the s(subjIndexF, bs="re") syntax is that smooths can be viewed as Gaussian random effects, so simple Gaussian random effects can also be viewed as (0-dimensional) smooths. In general s(x,z,w,bs="re") just appends the columns of model.matrix(~x:z:w-1) to the gam model matrix, and treats the associated coefficients as i.i.d. Gaussian random effects with a common variance (to be estimated). In principle this works with any number of arguments to s(...,bs="re").

See ?random.effects (and its linked help files) in mgcv for more.

There are mechanisms for allowing random smooth curves for each subject, (e.g. ?factor.smooth.interaction), but I would only use these if simpler approaches really aren't adequate here.

best,
Simon



On 30/03/17 17:06, David Winsemius wrote:
On Mar 30, 2017, at 6:56 AM, Leon Lee <bhamlio...@gmail.com> wrote:

David

Thank you for your reply. I apologize if I posted in the wrong forum, as I 
really couldn't decide which forum is the best place for my question and I saw 
similar questions asked before in this forum.

I agree that a sample of ~30 subjects (70 scans in total), the model can be too 
complicated. Based on that, I did the following:
(1) ignored the gender effect, as we have less females than males.
(2) corrected chronological age based on their gestational age, that is, we 
subtracted an infant's chronological age by 2 weeks, if the infant's 
gestational age is 38 weeks instead of 40weeks.

When I ran the model with corrected age, gestational age and their interactions 
modeled, I found the main effect of gestational age and the interaction between 
the two are gone.

So, my final model will look something like this:
gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"), 
method="REML", data=mydata)

Does this look more reasonable?
I'm still having difficulty understand how a "smoothing" function would be used to handle 
repeated measures without some sort of "group-within" indicator.

I would have imagined (and this is because I have no experience with using this 
package for repeated measures) something along the lines of:

  ...+s(correctedAg|subjIndexF)

I see this statement in the docs:


smooth.construct.re.smooth.spec {mgcv}  

"gam can deal with simple independent random effects, by exploiting the link between smooths 
and random effects to treat random effects as smooths. s(x,bs="re") implements this."

But I don't see that as applying to the dependency between individuals measured 
repeatedly. I find no examples of repeated measures problems being solve by 
gam(). There is also a note on the same page:

"Note that smooth ids are not supported for random effect terms. Unlike most smooth 
terms, side conditions are never applied to random effect terms in the event of nesting 
(since they are identifiable without side conditions)."

When I do a search on "using gam mgcv formula mixed effects" I am referred to 
packages 'gamm' and 'gamm4' produced by the same author (Simon Wood) as pkg 'mgcv', or to 
package `nlme`.


Yes, I am relatively new to the mixed model. We originally applied functional 
data analysis (PACE) on the data, but want to see the results using a different 
approach. Also, I couldn't find the Mixed Models list, do you mind sending me a 
link?
This is a link to the main mailing lists page:

https://www.r-project.org/mail.html

Found with a search on Google with "R mailing lists"


Thank you!
Longchuan


On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius <dwinsem...@comcast.net> wrote:

On Mar 28, 2017, at 9:32 AM, Leon Lee <bhamlio...@gmail.com> wrote:

Hi, R experts

I am new to R & GAM toolbox and would like to get inputs from you all on my
models. The question I have is as follows:
I have 30 subjects with each subject being scanned from one to three times
in the first year of life. The brain volume from each scan was measured.
The scan time was randomly distributed from birth to 1 year.
Each subject has different gestational age ranging from 38 to 41 weeks
Each subject has chronological age from birth to 1 year old
Each subject has gender category.
Now, I want to look at how predictors, such as subject's chronological age,
gestational age and gender will explain the changes in brain volume. I also
want to include interactions between gender and age, gestational and
chronological age. Random effects are also included in the model to account
for subject variability. My model looks like the follows:

gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)

Are there any obvious mistakes in the model? Any suggestions will be
greatly appreciated!
I'm not seeing mistakes in the syntax but I would question whether 30 subjects is 
sufficient to adequately support estimates in a a model of this complexity. I would also 
think that the 's(age)' and 'sex' terms would get aliased out in a model with "+ 
s(age, by=sex)". Most R regression functions handle removal of over-parametrization 
automatically.

You also have a variable number of measurements per subject. I am unable to comment on the effort 
to account for the implicit and variably measured correlation and auto-correlation of values within 
subjects using a "smooth" on subjIndexF, since that is not an approach I was familiar 
with.  But I am getting concerned whether you are also new to statistical modeling in addition to 
your use of R and GAM being "new to you"?

(Perhaps Simon or one of the mixed-effects experts can correct the gaps in my 
understanding of how to model repeated measures in the context of small numbers 
of subjects and irregular emasurements.)

Please read the Posting Guide and the pages of candidate mailing lists. Rhelp 
is not really the place to go when you need statistical advice. I'm not sure if 
this is really in the center of concerns that get discussed on the Mixed Models 
list, but to my eyes it would be a better fit there.

--
David.
L

       [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA


David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
+44 (0)117 33 18273     http://www.maths.bris.ac.uk/~sw15190

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to