Thank you Phillip,

and sorry for the late response. Thanks for the reference. I believe that many of the published papers on ERPs with mixed models describe the analysis in insufficient detail (even when looking beyond linguistics journals).

Concerning your comments: I understand that nuisance factors are not random effects. At the same time, the EEG amplitude is recorded from several electrodes, which are units of observation around which the data are clustered. In the paper you suggested they write:

"In simplified models, by-channel random slope parameters were estimated at zero, resulting in failures to converge to an optimal solution. This likely reflects the limited variance in effects across the selected centro-parietal channels due to volume conduction. Therefore, random slopes of effects across channels were not fit in final models." This observation is not far from your point about adjacent electrodes being very highly correlated with each other, but I guess this would only emerge in the random-effects covariance matrix when asking for by-channel random slopes or intercepts for some of the factors.

They therefore kept the term, estimating only adjustments to the intercepts, and I would have done the same just to improve fit, because I would say (1|channel) better describes the data structure to lmer. But out of pragmatism, necessity, or parsimony, I will likely leave this out (I am already trying to fit more than 80 parameters in the random structure, so I do not want to insist too much on including channels).
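For concreteness, the two specifications under discussion can be sketched as model formulas (the variable names here are hypothetical; only the channel term differs):

```r
## Sketch of the two options discussed above (hypothetical variable names).
## By-channel intercept adjustments only, as in Payne et al. (2015):
f_intercepts <- amplitude ~ agreement * noun + (1 | channel)

## Adding by-channel random slopes as well -- the term their simplified
## models estimated at zero, leading to convergence failures:
f_slopes <- amplitude ~ agreement * noun + (1 + agreement | channel)
```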

I will keep your suggestion about modelling the topographic factors using the XYZ scalp coordinates for future studies, when I have a better grasp of mixed models and some more computational power ;).

Thanks again for all your insights (I will refer to the Special Interest Group in the near future).
Best,
Paolo


On 28/09/2015 04:17, Phillip Alday wrote:
You might also want to take a look at the recent paper from the Federmeier 
group, especially the supplementary materials. There are a few technical 
inaccuracies (ANOVA is a special case of hierarchical modelling, not the other 
way around), but they discuss some of the issues involved. And relevant for 
your work: they model channel as a grouping variable in the random-effects 
structure.

Payne, B. R., Lee, C.-L., and Federmeier, K. D. (2015). Revisiting the 
incremental effects of context on word processing: Evidence from single-word 
event-related brain potentials. Psychophysiology.

http://dx.doi.org/10.1111/psyp.12515

Best,
Phillip

On 24 Sep 2015, at 22:42, Phillip Alday <phillip.al...@unisa.edu.au> wrote:

There is actually a fair amount of ERP literature using mixed-effects
modelling, though you may have to branch out from the traditional
psycholinguistics journals a bit (even just broader "neurolinguistics" or
language studies published in "psychology" journals would get you more!). But
just within the traditional psycholinguistics journals there is a wealth of
literature; see for example the 2008 special issue of the Journal of Memory
and Language on mixed models.

I would NOT encode the channels/ROIs/other topographic measures as
random effects (grouping variables). If you think about the traditional
ANOVA analysis of ERPs, you'll recall that ROI or some other topographic
measure (laterality, sagittality) is included in the main effects and
interactions. As a rule of thumb, this corresponds to a fixed effect in
mixed-effects models. More specifically, you generally care about the
particular levels of the topographic measure (i.e. you care whether an
ERP component is located left-anterior or whatnot), and this is what
fixed effects test. Random effects are more useful when you only
care about the variance introduced by a particular term but not the
specific levels (e.g. participants or items -- we don't care about a
particular participant, but we do care about how much variance there is
between participants, i.e. what the population of participants looks like).
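As a sketch of that rule of thumb (with hypothetical variable names), the formula you would hand to lmer() puts the topographic factors among the fixed effects and keeps participants and items as grouping variables:

```r
## Rule-of-thumb sketch (hypothetical names): topography as fixed effects,
## participants and items as random-effect grouping variables.
f <- amplitude ~ condition * longitude * laterality +
  (1 + condition | participant) +   # by-participant intercepts and slopes
  (1 + condition | item)            # by-item intercepts and slopes

## f would then be passed to lme4::lmer(f, data = ...)
```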

Or, another thought: You may have seen ANOVA by-subjects and by-items,
but I bet you've never seen an ANOVA by-channels. ANOVA "implicitly"
collapses the channels within ROIs and you can do the same with mixed
models. (That's an awkward statement technically, but it should help
with the intuition.)

There is another important, related point -- "nuisance parameters"
aren't necessarily random effects. So even if you're not interested in
the per-electrode distribution of the ERP component, that doesn't mean
those terms should automatically be random effects. It *might* make sense to
add a channel (as in per-electrode) random effect, if you care to model
the variation within a given ROI (as you have done), but I haven't seen
that yet. It is somewhat rare to include a per-channel fixed effect,
just because you lose a lot of information that way and introduce many
more parameters into the model, but you could include a more fine-grained
notion of sagittal / lateral location based on e.g. the 10-20 system and
make that into an ordered factor. (Or you could be extreme and even use
the spherical coordinates that the 10-20 system is based on and have
continuous measures of electrode placement!) The big problem with including
"channel" as a random-effect grouping variable is that the channels
would have a very complicated covariance structure (because adjacent
electrodes are very highly correlated with each other), and I'm not sure
how to model this in a straightforward way with lme4.
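A minimal base-R sketch of the ordered-factor idea (the channel-to-row assignments here are only illustrative, following the 10-20 row labels):

```r
## Illustrative sketch: map channels to a finer-grained anterior-posterior
## position (10-20 rows) and make it an ordered factor, instead of using
## a many-level "channel" term.
chan <- data.frame(channel = c("Fp1", "F3", "C3", "P3", "O1"))
chan$sagittal <- factor(c("Fp", "F", "C", "P", "O"),
                        levels = c("Fp", "F", "C", "P", "O"),
                        ordered = TRUE)

## Ordered factors get polynomial contrasts by default in R's modelling
## functions, so linear/quadratic trends along the scalp are estimable.
contrasts(chan$sagittal)
```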

More generally, in considering your random-effects structure, you should
look at Barr et al. (2013, "Random effects structure for confirmatory
hypothesis testing: Keep it maximal") and the recent reply by Bates et
al. (arXiv, "Parsimonious Mixed Models"). You should also read the GLMM
FAQ on testing random effects -- there are different opinions on this,
and not everyone thinks that testing them via likelihood-ratio tests makes
sense.
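The likelihood-ratio comparison that the FAQ discusses looks like this in lme4 (the data here are simulated purely for illustration, so the numbers are meaningless):

```r
## Sketch of testing a random effect via a likelihood-ratio test with
## anova(), as discussed (with caveats) in the GLMM FAQ.
library(lme4)

set.seed(1)
erp <- expand.grid(subject = factor(1:10), channel = factor(1:8),
                   cond = factor(c("a", "b")), rep = 1:3)
erp$amplitude <- rnorm(nrow(erp)) +
  rnorm(10, sd = 2)[erp$subject] +   # subject-level variation
  rnorm(8,  sd = 1)[erp$channel]     # channel-level variation

## Fit with ML (REML = FALSE) so the likelihoods are comparable:
m_full    <- lmer(amplitude ~ cond + (1 | subject) + (1 | channel),
                  data = erp, REML = FALSE)
m_reduced <- lmer(amplitude ~ cond + (1 | subject),
                  data = erp, REML = FALSE)

## Caveat: the null hypothesis (variance = 0) sits on the boundary of the
## parameter space, so this chi-square p-value is conservative.
anova(m_reduced, m_full)
```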

That wasn't my most coherent response, but maybe it's still useful. And
for questions like this on mixed models, do check out the R Special
Interest Group on Mixed Models. :-)

Best,
Phillip

On Thu, 2015-09-24 at 12:00 +0200, r-help-requ...@r-project.org wrote:
Message: 4
Date: Wed, 23 Sep 2015 12:46:46 +0200
From: Paolo Canal <paolo.ca...@iusspavia.it>
To: r-help@r-project.org
Subject: [R] Appropriate specification of random effects structure for
        EEG/ERP data: including Channels or not?
Message-ID: <56028316.2050...@iusspavia.it>
Content-Type: text/plain; charset="UTF-8"

Dear r-help list,

I work with EEG/ERP data, and this is the first time I am using LMMs to
analyze my data (using lme4). The experimental design is a 2x2: one
manipulated factor is agreement, the other is noun (agreement being
within subjects and items, and noun being within subjects and between
items).

The data matrix is 31 subjects * 160 items * 33 channels. In ERP
research, the distribution of the EEG amplitude differences (in a time
window of interest) is important, and we care about knowing whether a
negative difference is occurring in parietal or frontal electrodes. At
the same time, information from a single channel is often too noisy, so
channels are organized into topographic factors for evaluating
differences in distribution. In the present case I have assigned each
channel to one of three levels of each of two factors, i.e., Longitude
(Anterior, Central, Parietal) and Medial (Left, Midline, Right): for
instance, one channel is Anterior and Left. With traditional ANOVAs,
channels from the same level of the topographic factors are averaged
before variance is evaluated, which also has the benefit of reducing the
noise picked up by the electrodes.
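That ANOVA-style averaging step can be sketched in base R (the column names here are hypothetical, and the data are simulated just to make the snippet self-contained):

```r
## Sketch (hypothetical column names): averaging channels within each
## Longitude x Medial cell before analysis, as in the traditional ANOVA.
set.seed(1)
erp <- expand.grid(subject = factor(1:2), item = factor(1:4),
                   longitude = c("anterior", "central", "parietal"),
                   medial = c("left", "midline", "right"),
                   channel = 1:3)
erp$amplitude <- rnorm(nrow(erp))

roi_means <- aggregate(amplitude ~ subject + item + longitude + medial,
                       data = erp, FUN = mean)
## Each row of roi_means is now the mean over the channels in that ROI.
```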

I have trouble deciding on the random structure of my model. Very few
examples of LMMs on ERP data exist (e.g., Newman, Tremblay, Nichols,
Neville & Ullman, 2012), and little detail is provided about the
treatment of channel. I feel it is a tricky term but very important for
optimizing fit. Newman et al. say "data from each electrode within an
ROI were treated as repeated measures of that ROI". In Newman et al.,
the ROIs are the 9 regions deriving from Longitude X Medial
(Anterior-Left, Anterior-Midline, Anterior-Right, Central-Left ... and
so on), so in a way they treated each ROI separately and not according
to the relevant dimensions of Longitude and Medial.

We used the following specifications in lmer:

[fixed-effects specification: µV ~ Agreement * Noun * Longitude * Medial
* (cov1 + cov2 + cov3 + cov4)] (the terms within brackets are a series
of individual covariates, most of which are continuous variables)

[random-effects specification: (1 + Agreement * Type of Noun | subject) +
(1 + Agreement | item) + (1 | longitude:medial:channel)]

What I care most about is the last term,
(1|longitude:medial:channel). I chose this specification because I
thought that allowing each channel to have a different intercept in the
random structure would affect the estimation of the topographic fixed
effects (Longitude and Medial) in which channel is nested. Unfortunately,
a reviewer commented that since "channel is not included in the fixed
effects I would probably leave that out".

But each channel is a repeated measure of the EEG amplitude inside the
two topographic factors, and random terms do not have to appear in the
fixed structure -- otherwise we would also include subjects and items in
the fixed-effects structure. So I kind of feel that including channels
as a random effect is correct, and that having them nested in
longitude:medial allows us to relax the assumption that the effect in
the EEG always has the same longitude:medial distribution. But I might
be wrong.

I thus tested differences in fit (ML) with anova() between
(1|longitude:medial:channel), the same model without the term, and a
third model with the simpler (1|longitude:medial).

Fullmod vs. Nochannel:

            Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
modnoch    119 969479 970653 -484621   969241
fullmod    120 968972 970156 -484366   968732 508.73      1  < 2.2e-16 ***

The difference in fit is remarkable (no variance components with
estimates close to zero; no correlation parameters with values close to ±1).

Fullmod vs. SimplerMod:

             Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
fullmod     120 968972 970156 -484366   968732
simplermod  120 969481 970665 -484621   969241     0      0          1

Here the number of parameters to estimate in fullmod and simplermod is
the same, but the difference in fit is considerable (-509 BIC). So I
guess that although the chi-square is not significant, we do have a
strong increase in fit. As I understand this, a model with better fit
will find more accurate estimates, and I would be inclined to keep the
fullmod random structure.

But perhaps I am missing something or am doing something wrong. Which
is the correct random structure to use?

Feedback is very much appreciated. I often find answers on the list,
and this is the first time I post a question.
Thanks,
Paolo





______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
