Re: [R] Prediction accuracy from Bagging with continuous data

Peter Langfelder Thu, 10 Feb 2011 16:52:13 -0800

On Thu, Feb 10, 2011 at 8:45 AM, Simon Gillings <simon.gilli...@bto.org> wrote:
> I am using bagging to perform Bagged Regression Trees on count data (bird 
> abundance in Britain and Ireland, in relation to climate and land cover 
> variables). Predictions from the final model are visually believable but I 
> would really like a diagnostic equivalent to classification success that can 
> be used to decide if a model is adequate. Whereas with classification data an 
> error rate is returned, with continuous data only the root mean squared error 
> is returned. The RMSE is helpful for comparing different models for the same 
> species and deciding which is best, but as far as I can tell it offers no 
> absolute measure of how good that best model is.
>
> At present I am using the final model to make predictions for the original 
> dataset and then computing a correlation coefficient between observed and 
> predicted values but I expect this is probably biased high due to 
> non-independence. Ideally I think I need the correlation coefficient between 
> the predictions and observed values for the out of bag sample for each of the 
> n trees produced, but I don't see this produced anywhere.
>
> Does anyone know of a means of getting a useful unbiased diagnostic for 
> assessing overall fit?
>


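Not sure this suggestion is going to help you, but you could switch to
the Random Forest ensemble of regression trees (package randomForest).
For each observation, the Random Forest predictor automatically
calculates a predicted value using only the trees for which that
observation was out-of-bag. Comparing these out-of-bag predictions
with the observed values (by correlation or RMSE, say) gives an
estimate of accuracy that is not inflated by predicting on the same
data the trees were grown on.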
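
A minimal sketch of what that could look like. The data here are
simulated and purely hypothetical (temp, cover and count are made-up
stand-ins for the climate, land cover and abundance variables), and
ntree = 500 is an arbitrary choice. Setting mtry to the number of
predictors makes every split consider all variables, which as far as I
know reduces the forest to plain bagging of regression trees.

```r
library(randomForest)

set.seed(1)

# Hypothetical simulated count data, not the real bird survey
n <- 300
dat <- data.frame(temp = runif(n, 0, 20), cover = runif(n))
dat$count <- rpois(n, lambda = exp(0.1 * dat$temp + dat$cover))

# mtry = 2 (all predictors) so each split sees every variable, i.e. bagging
fit <- randomForest(count ~ temp + cover, data = dat,
                    ntree = 500, mtry = 2)

# For a model fitted this way, fit$predicted holds out-of-bag predictions:
# each value comes only from trees that did not see that observation
oob_pred <- fit$predicted

# Out-of-bag diagnostics
oob_cor  <- cor(dat$count, oob_pred)
oob_rmse <- sqrt(mean((dat$count - oob_pred)^2))
oob_rsq  <- tail(fit$rsq, 1)   # pseudo R-squared, 1 - mse / Var(y)

# For comparison: predicting back onto the training data, which is the
# approach expected to be biased high
insample_cor <- cor(dat$count, predict(fit, newdata = dat))

print(c(oob_cor = oob_cor, insample_cor = insample_cor,
        oob_rmse = oob_rmse, oob_rsq = oob_rsq))
```

The gap between insample_cor and oob_cor gives a rough idea of how
optimistic the resubstitution correlation is. Note that predict(fit)
called without newdata also returns the out-of-bag predictions, so
newdata has to be supplied explicitly to get the in-sample ones.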

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
