Re: [R] Dummy variables or factors?

David Winsemius Tue, 20 Oct 2009 15:45:13 -0700


On Oct 20, 2009, at 4:00 PM, Luciano La Sala wrote:

Dear R-people,
I am analyzing epidemiological data using GLMM using the lmer package. I usually explore the assumption of linearity of continuous variables in the logit of the outcome by creating 4 categories of the variable, performing a bivariate logistic regression, and then plotting the coefficients of each category against their mid points. That gives me a pretty good idea about the linearity assumption and possible departures from it.
I know of people who create 0,1 dummy variables in order to relax the linearity assumption. However, I've read that dummy variables are never needed (nor are desirable) in R! Instead, one should make use of factor variables. That is much easier to work with than dummy variables, and the model itself will create the necessary dummy variables.
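
As a small illustration of the point about factors, the sketch below shows the 0/1 indicator columns that R's modelling functions build from a factor. The variable names (age, age_cat) and the cut points are made up for the example and are not from the poster's data.

```r
# Hypothetical data: a continuous exposure cut into 4 categories.
set.seed(1)
age <- runif(200, 20, 80)
age_cat <- cut(age, breaks = c(20, 35, 50, 65, 80), include.lowest = TRUE)

# model.matrix() exposes the design matrix a model formula would use.
# Under the default treatment contrasts the first level is the reference,
# so a 4-level factor yields an intercept plus 3 indicator columns.
X <- model.matrix(~ age_cat)
head(X)
colnames(X)
```

The same expansion happens automatically when a factor appears on the right-hand side of a model formula, so the indicators need not be coded by hand.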
Having said that, if my data violate the linearity assumption, does the use of a factor for the variable in question help overcome the lack of linearity?

No. If done by dividing into a small number of categories after looking at the data, it merely creates other (and probably more severe) problems. If you are in the unusual (although desirable) position of having a large number of events across the range of the covariates in your data, you may be able to cut your variable into quintiles or deciles and analyze the resulting factor, but the preferred approach would be to fit a regression spline of sufficient complexity.
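
A minimal sketch of the spline approach, under stated assumptions: the data are simulated with a deliberately curved log-odds, the model is an ordinary glm() rather than a mixed model, and the choice of a natural spline with df = 4 from the splines package is only one reasonable option, not a recommendation for any particular data set. The quintile factor is included for comparison.

```r
library(splines)  # ships with R

set.seed(42)
n <- 1000
x <- runif(n, 0, 10)
# Simulated truth (an assumption for illustration): log-odds quadratic in x.
eta <- -2 + 0.15 * (x - 5)^2
y <- rbinom(n, 1, plogis(eta))

# Linear-in-the-logit fit versus a natural cubic spline fit.
fit_linear <- glm(y ~ x, family = binomial)
fit_spline <- glm(y ~ ns(x, df = 4), family = binomial)

# Quintile factor; the cut points are fixed by quantiles rather than
# chosen after inspecting the outcome.
x_q <- cut(x, breaks = quantile(x, probs = seq(0, 1, 0.2)),
           include.lowest = TRUE)
fit_quint <- glm(y ~ x_q, family = binomial)

# The linear model is nested in the natural spline model, so a
# likelihood-ratio test gives one check on the linearity assumption.
anova(fit_linear, fit_spline, test = "Chisq")
AIC(fit_linear, fit_spline, fit_quint)
```

For a mixed model, the same ns() term could presumably go in the fixed-effects part of the model formula; that is not shown here.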

Thanks in advance.


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
