I've been using the R package randomForest, but there is one aspect I cannot work out the meaning of. After calling the randomForest function, the returned object contains an element called predicted, which is the prediction obtained using all the trees (at least that's my understanding). I've checked that this set of predictions has the same error rate as reported by err.rate.
However, if I send the training data back into the predict.randomForest function, I get a different result from the stored set of predictions. This is true for both classification and regression. The predictions obtained this way also have a much lower error rate and perform very well (suspiciously well...) on measures such as AUC. My understanding is that the two sets of predictions above should be the same. Since they are not, I must be misunderstanding something. Any ideas what's going on?
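Here is a minimal sketch of the comparison described above. It uses the built-in iris data as a stand-in for my own data, and the seed and ntree value are arbitrary choices for illustration, so treat the exact numbers as an assumption rather than my real results.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only so the run is repeatable

# iris is a stand-in for the real data (assumption for illustration)
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

# Predictions stored in the fitted object
stored <- rf$predicted

# Predictions from sending the training data back through predict()
resub <- predict(rf, newdata = iris)

# Error rate of the stored predictions, and the final row of err.rate
stored_err   <- mean(stored != iris$Species)
reported_err <- unname(rf$err.rate[rf$ntree, "OOB"])

# Error rate of the predictions made on the training data
resub_err <- mean(resub != iris$Species)

print(c(stored = stored_err,
        reported = reported_err,
        training = resub_err))

# How many observations get a different label from the two approaches
print(sum(stored != resub))
```

In this sketch the first two numbers agree with each other, while the third is lower, which is the discrepancy I am asking about.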