On Oct 20, 2009, at 4:00 PM, Luciano La Sala wrote:
Dear R-people,
I am analyzing epidemiological data with a GLMM, using lmer from the
lme4 package. I usually explore the assumption of linearity of continuous
variables in the logit of the outcome by creating 4 categories of
the variable, performing a bivariate logistic regression, and then
plotting the coefficients of each category against their mid points.
That gives me a pretty good idea about the linearity assumption and
possible departures from it.
I know of people who create 0,1 dummy variables in order to relax
the linearity assumption. However, I've read that dummy variables
are never needed (nor are they desirable) in R! Instead, one should make
use of factor variables. These are much easier to work with than dummy
variables, and the model itself will create the necessary dummy
variables.
Having said that, if my data violate the linearity assumption, does
the use of a factor for the variable in question help overcome the
lack of linearity?
No. If done by dividing into a small number of categories after
looking at the data, it merely creates other (and probably more
severe) problems. If you are in the unusual (although desirable)
position of having a large number of events across the range of the
covariates in your data, you may be able to cut your variable into
quintiles or deciles and analyze the resulting factor, but the
preferred approach would be to fit a regression spline of sufficient
complexity.
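
As a rough sketch of what that comparison might look like, here is an
example on simulated data using an ordinary glm rather than a mixed model.
The names x, y and dat, the quadratic "true" logit, and the choice of 4
degrees of freedom are all assumptions made up for illustration, not
anything taken from the data described above.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
library(splines)  # ships with base R; provides ns()

set.seed(1)
n <- 2000
x <- runif(n, 0, 10)
# Assumed "true" logit is quadratic in x, so a purely linear term is wrong
y <- rbinom(n, 1, plogis(-2 + 0.8 * x - 0.08 * x^2))
dat <- data.frame(x = x, y = y)

# 1. Linear-in-the-logit fit
fit_lin <- glm(y ~ x, family = binomial, data = dat)

# 2. Natural cubic regression spline with 4 degrees of freedom
fit_ns <- glm(y ~ ns(x, df = 4), family = binomial, data = dat)

# 3. Quintile factor: cut points come from the quantiles of x,
#    not from inspecting the outcome
dat$x_q <- cut(dat$x,
               breaks = quantile(dat$x, probs = seq(0, 1, by = 0.2)),
               include.lowest = TRUE)
fit_q <- glm(y ~ x_q, family = binomial, data = dat)

# Likelihood-ratio test of the linear fit against the spline fit
# (the linear model is nested within the natural-spline model)
print(anova(fit_lin, fit_ns, test = "Chisq"))

# Informal comparison of all three fits
print(AIC(fit_lin, fit_ns, fit_q))
```

The same ns(x, df = 4) term should be usable in the fixed-effects part
of a mixed-model formula as well; the degrees of freedom would need to
be chosen with the number of events in mind.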
Thanks in advance.
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT