Hello everyone,

I am doing a binary classification with random forest, using the
"randomForest" package. One of the predictors is a factor variable that is
known to be strongly related to the binary response I am trying to predict.
The other 80 predictors are numeric, and I have 44 subjects in total.
However, the random forest ranks the factor variable as the least important
predictor according to the mean decrease in accuracy. I fitted the model as
follows:

myrf <- randomForest(disease ~ ., data = mydata, ntree = 500,
                     importance = TRUE, do.trace = FALSE, keep.forest = TRUE)

disease is a factor with the levels "cases" and "controls". Is there
anything I need to do differently when a factor predictor is included?
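
For reference, here is a minimal self-contained sketch of the setup with
simulated data, since I cannot share the real data. The names grp and
x1 ... x80 are made up for illustration and are not my actual variables, and
the 15% noise rate is an arbitrary assumption. It confirms that the predictor
is stored as a factor and extracts both importance measures through
importance(), so they can be compared side by side.

```r
library(randomForest)

set.seed(1)
n <- 44

# Simulated stand-in for mydata (hypothetical names and values)
disease <- factor(rep(c("cases", "controls"), each = n / 2))

# Factor predictor that agrees with the response except for ~15% random flips
flip <- runif(n) < 0.15
grp <- factor(ifelse(xor(disease == "cases", flip), "A", "B"))

# 80 numeric predictors of pure noise
x <- matrix(rnorm(n * 80), nrow = n,
            dimnames = list(NULL, paste0("x", 1:80)))
mydata <- data.frame(disease, grp, x)

# Check that the predictor really is a factor, not character or numeric
stopifnot(is.factor(mydata$grp))

myrf <- randomForest(disease ~ ., data = mydata, ntree = 500,
                     importance = TRUE)

# type = 1: permutation-based mean decrease in accuracy
# type = 2: mean decrease in node impurity (Gini)
imp_acc  <- importance(myrf, type = 1)
imp_gini <- importance(myrf, type = 2)

# Top predictors by each measure
print(head(imp_acc[order(imp_acc[, 1], decreasing = TRUE), , drop = FALSE]))
print(head(imp_gini[order(imp_gini[, 1], decreasing = TRUE), , drop = FALSE]))
```

Replacing the simulated mydata with the real data frame gives the same two
rankings for my actual model.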

Thank you very much,

Jing

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
