On 12-05-02 12:29 PM, Matt Pennell wrote:
> Hi R-sig-phylo readers,
> 
> I have lately been thinking about an issue regarding model
> selection in trait evolutionary models and was wondering if anyone
> on the list had any insight into this question:
> 
> It is now commonplace for researchers to use some model selection
> criterion such as AIC/AICc/BIC to select a model of trait
> evolution. In their book, Burnham and Anderson discuss how the
> derivation of AIC only holds if the number of data points is much
> larger than the number of parameters (they suggest, roughly, that
> n/k > 40). If this is not true, they provide a small-sample-size
> correction for the AIC (AICc), which explicitly takes into account
> the number of observations. Similarly, BIC includes the number of
> observations in its formulation.
> 
> My question is: how many observations do we have when we compare
> trait evolutionary models? People tend to use the number of tips of
> taxa for which we have trait values. However, this may not be
> technically accurate. First, of course, both the branch lengths and
> the tip values factor into the likelihood equations so it seems
> sensible that these are both somehow included as observations.
> Second, the trait values we observe are of course not independent
> (that is the whole reason we are using a phylogeny in the first
> place!!). It is unclear whether/how this fact should factor into
> our calculation of n. I know that in phylogenetics, when people
> do model selection for the model of sequence evolution, they use
> the number of sites in the alignment, though I am not sure there is
> a clear justification for this either. I was just wondering what
> people thought about this. Boettiger et al. (2012) showed that the
> choice of the evolutionary model for moderately sized phylogenies
> is very different when using AIC vs AICc so I think this may be
> worth some serious consideration.
> 
> Any thoughts?
> 
> cheers, matt

  Extremely interesting, extremely giant can of worms.  Over on the
mixed model side of the world people have been discussing this for
years ...

* "effective number of observations" is probably not always a
precisely defined concept (it may depend on what you're trying to use
the number for)
* it may depend on the scale at which you're defining the 'best'
model.  In particular, with both the conditional AIC defined by Vaida
and Blanchard (I think the ref. is 2005), and with the deviance
information criterion (Spiegelhalter et al), both of which are
attempting to measure 'effective number of parameters' at some scale,
the correct definition depends on whether you are trying to maximize
predictive accuracy at the scale of individual units (taxa) or at the
scale of the population (e.g. predictions for as-yet-unmeasured taxa,
or for the expected effect of a change in some covariate applied to a
randomly sampled taxon) -- in DIC this is referred to as the "level of
focus".  There is a nice blog post by Bob O'Hara on the topic.
* Haven't looked at Boettiger et al. (2012), but we should be reminded
that the AICc was derived in a very particular context (linear models)
and has been extrapolated *far* beyond that context -- it's reasonable
as a rule of thumb, but we also shouldn't be surprised if it fails
sometimes (Shane Richards has an Ecology paper where he shows that it
can do poorly in a GLM context)
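
To make the dependence on n concrete, here is a minimal Python sketch. It assumes the usual textbook forms of the three criteria (AIC = -2 logL + 2k, AICc = AIC + 2k(k+1)/(n-k-1), BIC = -2 logL + k log n). The two "models", their log-likelihoods and their parameter counts are invented for illustration only; they do not come from Boettiger et al. or from any real data set, and the helper names are hypothetical.

```python
import math


def aic(loglik, k):
    """Akaike information criterion; does not depend on n."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik, k, n):
    """Small-sample corrected AIC; the correction term depends on n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion; the penalty grows with log(n)."""
    return -2.0 * loglik + k * math.log(n)


def score(criterion, loglik, k, n):
    """Dispatch on the criterion name ('aic', 'aicc' or 'bic')."""
    if criterion == "aic":
        return aic(loglik, k)
    if criterion == "aicc":
        return aicc(loglik, k, n)
    if criterion == "bic":
        return bic(loglik, k, n)
    raise ValueError("unknown criterion: " + criterion)


def best_model(fits, n, criterion):
    """Return the name of the model with the lowest score for a given n."""
    return min(fits, key=lambda name: score(criterion, *fits[name], n))


# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
# These numbers are made up purely to show the effect of n.
FITS = {"simple": (-52.0, 2), "complex": (-49.5, 4)}

# Same likelihoods, different assumptions about what "n" is.
for n in (15, 30, 300):
    picks = {c: best_model(FITS, n, c) for c in ("aic", "aicc", "bic")}
    print(n, picks)
```

With these made-up numbers, AIC picks the more complex model whatever n is, while AICc picks the simpler model at n = 15 and n = 30 and only agrees with AIC at n = 300. The fitted likelihoods never change; only the value plugged in for n does, which is why the question of what counts as an observation matters.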

  Ben Bolker


_______________________________________________
R-sig-phylo mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo