Hi all,
I'm currently using the 'rpart' function to run some regression analysis, and I
am at the point where I wish to prune my overfitted trees. Having read the
documentation, I understand that pruning is controlled by the complexity
parameter (cp). My question is: how should I go about choosing the complexity
parameter for my tree? In some places
(http://www.statmethods.net/advstats/cart.html) I have read that it is best to
select the complexity parameter which minimises the cross-validated (x) error
of the model, but elsewhere I have read that the optimum cp is the leftmost
value on the complexity parameter plot whose cross-validated error lies below
the '1+SE' line.
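To make the two rules concrete, here is a minimal sketch of how I understand
each would be applied. The data set (cu.summary, which ships with rpart), the
formula, and the variable names are only placeholders for illustration, not my
actual model, and I may well be misreading one of the rules.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random, so fix the seed

# Placeholder model: any rpart fit with cross-validation (xval > 0) would do
fit <- rpart(Mileage ~ Price + Country + Reliability + Type,
             data = cu.summary, method = "anova")

cpt <- fit$cptable  # columns: CP, nsplit, rel error, xerror, xstd

# Rule 1: the cp with the smallest cross-validated error
i_min  <- which.min(cpt[, "xerror"])
cp_min <- cpt[i_min, "CP"]

# Rule 2: the simplest tree (first row, i.e. largest cp) whose xerror
# is within one standard error of the minimum
threshold <- cpt[i_min, "xerror"] + cpt[i_min, "xstd"]
i_1se     <- which(cpt[, "xerror"] <= threshold)[1]
cp_1se    <- cpt[i_1se, "CP"]

pruned_min <- prune(fit, cp = cp_min)
pruned_1se <- prune(fit, cp = cp_1se)
```

As far as I can tell, the second rule can never give a larger tree than the
first, and plotcp(fit) draws the same threshold as a horizontal line.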
I was hoping someone might be able to clarify this minor issue for me.
Many thanks,
Andy