There are many ways to measure prediction quality, and which one you choose depends on the data and your goals. A common measure for a quantitative response is mean squared error, i.e. 1/n * sum((observed - predicted)^2), which reflects both the bias and the variance of the predictions. Common terms for what you are looking for are "test error" and "generalization error".
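As a rough illustration, here is a minimal sketch of computing test error in R. The data are simulated and purely hypothetical. The names A, B, Y and T follow the question, and glm() stands in for whichever model you actually fit.

```r
# Hypothetical, simulated data: A is the training set, B the evaluation set.
set.seed(1)
A <- data.frame(T = runif(100, 0, 20))
A$Y <- 2 + 0.5 * A$T + rnorm(100)
B <- data.frame(T = runif(50, 5, 15))   # subinterval of A's range (interpolation)
B$Y <- 2 + 0.5 * B$T + rnorm(50)

fit  <- glm(Y ~ T, data = A)            # gaussian family by default
pred <- predict(fit, newdata = B)       # predictions for the rows of B

# Test error: 1/n * sum((observed - predicted)^2), computed on B only.
test_mse  <- mean((B$Y - pred)^2)
test_rmse <- sqrt(test_mse)             # same units as Y

# Training error, for comparison; it tends to be optimistic.
train_mse <- mean(residuals(fit, type = "response")^2)

print(c(train_mse = train_mse, test_mse = test_mse, test_rmse = test_rmse))
```

The same pattern should carry over to other model classes that have a predict() method accepting newdata. Only the fitting line changes.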
hth,
Kingsford

On Wed, Sep 2, 2009 at 11:56 PM, Corrado <ct...@york.ac.uk> wrote:
> Dear R-friends,
>
> How do you test the goodness of prediction of a model, when you predict on a
> set of data DIFFERENT from the training set?
>
> I explain myself: you train your model M (e.g. glm,gam,regression tree, brt)
> on a set of data A with a response variable Y. You then predict the value of
> that same response variable Y on a different set of data B (e.g. predict.glm,
> predict.gam and so on). Dataset A and dataset B are different in the sense that
> they contain the same variable, for example temperature, measured in different
> sites, or on a different interval (e.g. B is a subinterval of A for
> interpolation, or a different interval for extrapolation). If you have the
> measured values for Y on the new interval, i.e. B, how do you measure how good
> is the prediction, that is how well model fits the Y on B (that is, how well
> does it predict)?
>
> In other words:
>
> Y~T,data=A for training
> Y~T,data=B for predicting
>
> I have devised a couple of method based around 1) standard deviation 2) R^2,
> but I am unhappy with them.
>
> Regards
> --
> Corrado Topi
>
> Global Climate Change & Biodiversity Indicators
> Area 18,Department of Biology
> University of York, York, YO10 5YW, UK
> Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.