> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Saruman
> 
> I don't see how this answered the original poster's question.
> 
> He was quite clear: the predictions coming out of RF do not match what
> comes out of the predict function using the same RF object and the same
> data. Therefore, what is predict() doing that is different from RF?
> Yes, RF is making its predictions using OOB, but nowhere does it say what
> predict() is doing; indeed, it says if newdata is not given, then the
> results are just the OOB predictions. But if newdata = olddata, then
> predict(newdata) != OOB predictions. So what is it then?

Let me make this as clear as I possibly can:  If predict() is called without 
newdata, all it can do is assume that prediction on the training set is desired.  
In that case it returns the OOB predictions, i.e., each training case is 
predicted using only the trees for which that case was out of bag.  If newdata 
is given in predict(), it assumes the data are "new" and thus makes predictions 
using all trees.  If you just feed the training data as newdata, then yes, you 
will get overfitted predictions, because many of those trees were grown on the 
very cases being predicted.  It almost never makes sense (to me anyway) to make 
predictions on the training set.
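
A minimal sketch of the two behaviours described above, assuming the 
randomForest package is installed.  The mtcars data, the seed, and the helper 
mse() are illustrative choices and do not come from the original thread.

```r
# Sketch only: mtcars, the seed, and mse() are illustrative assumptions.
library(randomForest)

set.seed(42)
rf <- randomForest(mpg ~ ., data = mtcars, ntree = 500)

# No newdata: out-of-bag (OOB) predictions, the same values stored in rf$predicted.
pred_oob <- predict(rf)

# Training data passed as newdata: every tree contributes, including the
# trees that were grown on the case being predicted.
pred_all <- predict(rf, newdata = mtcars)

# Mean squared error against the observed response.
mse <- function(pred) mean((mtcars$mpg - pred)^2)
c(oob = mse(pred_oob), all_trees = mse(pred_all))
```

The all_trees error should come out noticeably smaller than the OOB error, 
which is the overfitting in question; the OOB figure is the more honest 
estimate of how the forest does on unseen data.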
 
> That opens another issue: if newdata is close but not exactly olddata,
> then you get overfitted results?

Possibly, depending on how "close" the new data are to the training set.  This 
applies to nearly _ALL_ methods, not just RF.
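
One way to probe how "close" matters is to perturb the training predictors by 
increasing amounts and compare the error with the OOB error.  This is only a 
sketch under stated assumptions: the mtcars data, the noise scales, and the 
helper perturb() are all hypothetical choices made for illustration.

```r
# Sketch only: the data set, noise scales, and perturb() are illustrative assumptions.
library(randomForest)

set.seed(42)
rf <- randomForest(mpg ~ ., data = mtcars, ntree = 500)
predictors <- setdiff(names(mtcars), "mpg")

# Add Gaussian noise to each predictor, scaled by that column's standard deviation.
perturb <- function(data, scale) {
  noisy <- data
  for (col in predictors) {
    noisy[[col]] <- data[[col]] + rnorm(nrow(data), sd = scale * sd(data[[col]]))
  }
  noisy
}

# scale = 0 reproduces the training set exactly; larger values move away from it.
scales <- c(0, 0.01, 0.1, 0.5, 1)
mse_near <- sapply(scales, function(s) {
  mean((mtcars$mpg - predict(rf, newdata = perturb(mtcars, s)))^2)
})
mse_oob <- mean((mtcars$mpg - predict(rf))^2)

data.frame(scale = scales, mse = mse_near, mse_oob = mse_oob)
```

The scale = 0 row is the training-set case and should sit well below the OOB 
error.  How quickly the error grows with the perturbation depends on the data 
and the seed, so it is worth checking on your own problem rather than relying 
on this toy example.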

Andy
 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Question-about-randomForest-tp4111311p4529770.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
