Variable selection is part of the training process: it chooses the model. By definition, test data is used only for testing (evaluating the chosen model).
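
To make that separation concrete, here is a minimal sketch in R on simulated data. Everything in it is an assumption for illustration: the data frame `dat`, the variables `x1` to `x4`, the 150/50 split, and the use of base R's `step()` (which, like `stepAIC`, selects by AIC). It is one way to lay out the workflow, not a prescribed one.

```r
# Simulated data (illustrative assumption): y depends on x1 and x2 only.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

# Split once; the test rows play no part in any modelling decision.
test_idx <- sample(n, 50)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Variable selection sees the training data only.
full <- lm(y ~ x1 + x2 + x3 + x4, data = train)
sel  <- step(full, direction = "backward", trace = 0)

# The test data is used once, to evaluate the model that was chosen.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
test_rmse <- rmse(test$y, predict(sel, newdata = test))

print(formula(sel))
print(test_rmse)
```

If several candidate models have to be compared on out-of-sample error, one option is to make that comparison by cross-validation inside `train`, so that `test` is still touched only once at the end.
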
If you find a package or function that does variable selection on test data, run from it!

Best,
Andy

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Jin Minming
> Sent: Monday, January 30, 2012 8:14 AM
> To: r-help@r-project.org
> Subject: [R] Variable selection based on both training and
> testing data
>
> Dear all,
>
> The variable selection in regression is usually determined by
> the training data using AIC or F value, such as stepAIC. Is
> there some R package that can consider both the training and
> test dataset? For example, I have two separate training data
> and test data. Firstly, a regression model is obtained by
> using training data, and then this model is tested by using
> test data. This process continues in order to find some
> possible optimal models in terms of RMSE or R2 for both
> training and test data.
>
> Thanks,
>
> Jim
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.