Hello everyone, I am doing binary classification with a random forest, using the "randomForest" package. One of the predictors is a factor variable that is known to be strongly related to the binary response I am trying to predict. The other 80 predictors are numeric. In total I have 44 subjects. However, the random forest ranks the factor variable as the least important predictor according to the mean decrease in accuracy measure. I specified the model as follows:
myrf <- randomForest(disease ~ ., data = mydata, ntree = 500, importance = TRUE, do.trace = FALSE, keep.forest = TRUE)

disease is a factor with levels "cases" and "controls". Is there anything I need to do differently when a factor predictor is included?

Thank you very much,
Jing

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
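
A minimal reproducible sketch of the setup described above, using simulated data because the real data are not shown. The column name `group`, the simulated values, and the seed are all placeholders and not part of the original problem. It only checks that the predictor really is stored as a factor and prints both importance measures for it, which may help narrow down where the ranking comes from:

```r
library(randomForest)

set.seed(1)
n <- 44

# Simulated stand-in for mydata: 80 numeric predictors (X1..X80)
mydata <- data.frame(matrix(rnorm(n * 80), nrow = n))

# Binary response and a hypothetical factor predictor that agrees
# with it for all but 4 randomly chosen subjects
status <- rep(c("cases", "controls"), each = n / 2)
group <- ifelse(status == "cases", "a", "b")
flip <- sample(n, 4)
group[flip] <- ifelse(group[flip] == "a", "b", "a")

mydata$group <- factor(group)
mydata$disease <- factor(status)

# A character or numeric-coded column would be handled differently,
# so confirm the storage type before fitting
stopifnot(is.factor(mydata$group), is.factor(mydata$disease))

myrf <- randomForest(disease ~ ., data = mydata, ntree = 500,
                     importance = TRUE)

# Importance matrix: one row per predictor; for classification the columns
# include MeanDecreaseAccuracy and MeanDecreaseGini
imp <- importance(myrf)
print(imp["group", c("MeanDecreaseAccuracy", "MeanDecreaseGini")])

# Unscaled permutation importance (type = 1), for comparison with the
# default scaled version
imp_raw <- importance(myrf, type = 1, scale = FALSE)
print(imp_raw["group", ])
```

Running the same checks on the real `mydata` (with the actual column name in place of `group`) would show whether the factor is stored as expected and whether the two importance measures agree on its rank.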