> Incidentally, there is nothing new or radical in this; indeed, John Tukey,
> Leo Breiman, George Box, and others wrote eloquently about this decades ago.
> And Breiman's random forest modeling procedure explicitly abandoned efforts
> to build simply interpretable models (from which one might infer causality)
> in favor of building better interpolators, although assessment of "variable
> importance" does try to recover some of that interpretability (however, no
> guarantees are given).
I've found making the distinction between models for explanation and models for prediction to be particularly helpful. I was first made aware of this split by Brian Ripley's talk "Selecting amongst large classes of models", presented at a symposium in honour of John Nelder's 80th birthday:
http://www.stats.ox.ac.uk/~ripley/Nelder80.pdf

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.