Hello,

I am not sure what the predict.randomForest function is doing, and I am confused.
I would expect these two function calls to return the same predictions:

```{r}
library(randomForest)
library(dplyr)  # provides %>% and select()

diachp.rf <- randomForest(quality ~ ., data = data, ntree = 50, importance = TRUE)

# predict() called without new data
ypred_oob <- predict(diachp.rf)

# predict() called on the training predictors, with the response removed
dataX <- data %>% select(-quality)
ypred <- predict(diachp.rf, dataX)

ypred_oob == ypred
```

These are both out-of-bag predictions, but ypred and ypred_oob are actually very different:

    > table(ypred_oob, data$quality)

    ypred_oob    0    1
            0 1324  346
            1  493 2837

    > table(ypred, data$quality)

    ypred    0    1
        0 1817    0
        1    0 3183

What I find even more disturbing is the 100% accuracy for ypred. Would you agree that this is rather unexpected?

regards
Witek

--
Witold Eryk Wolski