Hi all, I have a question about rpart and randomForest results:
We fitted a single regression tree using rpart and got a pseudo-R2 of roughly 10% (which is not too bad compared with a linear regression on this data). Encouraged by this, we grew a whole regression forest on the same data set using randomForest. But we got pretty bad pseudo-R2 values for the forest (sometimes even negative values for some option settings).

We then thought that if we built only a single tree with the randomForest routine, we should get a result similar to that of rpart. So we set the randomForest options to grow only one tree, but the resulting pseudo-R2 value was negative as well.

Does anyone have a clue as to why the randomForest results are so bad whereas the rpart result is quite OK? Is our assumption wrong that a single tree grown by randomForest should give results similar to those of a tree grown by rpart? What am I missing here?

Thanks a lot for your help!

Sonja

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.