Jay,

Unless I have misunderstood some statistical subtleties, you can use the AIC in place of actual cross-validation, as the AIC is asymptotically equivalent to leave-one-out cross-validation when the models are fitted by maximum likelihood (Stone, 1977).

Joe
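The equivalence can be checked numerically. What follows is only a minimal sketch on simulated data, not anything taken from Stone's paper: the sample size, coefficients, variable names (`x1`, `x2`, `y`) and the helper `loo_deviance()` are all illustrative assumptions. It compares the AIC of two logit models with their leave-one-out deviance, i.e. -2 times the sum of the held-out log predictive densities, which is the logarithmic assessment Stone's result refers to.

```r
# Sketch: AIC versus leave-one-out (LOO) deviance for logit models.
# Simulated data; all names and settings here are illustrative assumptions.
set.seed(1)
n   <- 200
x1  <- rnorm(n)                                # informative predictor
x2  <- rnorm(n)                                # pure-noise predictor
y   <- rbinom(n, 1, plogis(-0.5 + 1.0 * x1))   # binary response
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Hypothetical helper: refit the model n times, each time leaving out one
# observation, and score the held-out point by its log predictive density.
# It assumes the response column is called 'y'.
loo_deviance <- function(formula, data) {
  ll <- vapply(seq_len(nrow(data)), function(i) {
    fit <- glm(formula, family = binomial, data = data[-i, ])
    p   <- predict(fit, newdata = data[i, , drop = FALSE], type = "response")
    dbinom(data$y[i], size = 1, prob = p, log = TRUE)
  }, numeric(1))
  -2 * sum(ll)
}

formulas <- list(small = y ~ x1, large = y ~ x1 + x2)

comparison <- data.frame(
  AIC = sapply(formulas, function(f) AIC(glm(f, family = binomial, data = dat))),
  LOO = sapply(formulas, function(f) loo_deviance(f, dat))
)
print(comparison)
```

If the two columns come out close to each other and rank the models the same way, that is consistent with the asymptotic result. Note that the equivalence is stated for a fixed set of candidate models and says nothing about the extra optimism introduced by a stepwise search over many subsets.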
Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion. Journal of the Royal Statistical Society, Series B (Methodological), 39, 44-47.

Abstract: A logarithmic assessment of the performance of a predicting density is found to lead to asymptotic equivalence of choice of model by cross-validation and Akaike's criterion, when maximum likelihood estimation is used within each model.

Jay <josip.2...@gmail.com>
Sent by: r-help-boun...@r-project.org
04/02/2010 09:14 AM
To: r-help@r-project.org
cc:
Subject: [R] Cross-validation for parameter selection (glm/logit)

If my aim is to select a good subset of parameters for my final logit model built using glm(), what is the best way to cross-validate the results so that they are reliable?

Let's say that I have a large dataset with thousands of observations. I split this data into two groups, one that I use for training and another for validation. First I use the training set to build a model, and then run stepAIC() with a forward-backward search.

BUT, if I base my parameter selection purely on this result, I suppose it will be somewhat skewed due to the one-time data split (I use only one training dataset). What is the correct way to perform this variable selection? And are there readily available packages for this?

Similarly, when I have my final parameter set, how should I go about making the final assessment of the model's predictability? CV? What package?

Thank you in advance,
Jay

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.