Robin Williams wrote:

<<<
Is there any facility in R to perform a stepwise process on a model, which will remove any highly-correlated explanatory variables? I am told there is in SPSS. I have a large number of variables (some correlated), which I would like to just chuck in to a model and perform stepwise and see what comes out the other end, to give me an idea perhaps as to which variables I should focus on. Thanks for any help / suggestions.
>>>
Stepwise is a bad method of selecting variables. Far better methods are the LASSO and LAR (least angle regression), available in the lars package and the lasso2 package. However, while both these methods are good, neither is a substitute for substantive knowledge.

Also, the key thing is not so much whether variables are correlated, but whether they are collinear, which is different: collinearity means that one variable is close to a linear combination of several others. If you have a great many variables, then you can have a high degree of collinearity even with no high pairwise correlations.

I've not done this in R, but RSiteSearch("collinearity", restrict = 'functions') yields 34 hits.

HTH

Peter
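
To illustrate the point about collinearity without high pairwise correlations, here is a minimal sketch in base R on simulated data. Everything in it is an assumption made for illustration (the sample size, the eight independent predictors, and the ninth variable x9 built as their sum plus a little noise); none of it comes from the original question. By construction, x9 is correlated only moderately with each individual predictor (about 0.35 in the population), yet it is almost exactly determined by all of them together. The variance inflation factors are read off the diagonal of the inverse correlation matrix, which is one standard way to compute them.

```r
# Simulated example: collinearity with no high pairwise correlations.
# All names and settings here are illustrative assumptions.
set.seed(1)
n <- 500   # number of observations
k <- 8     # number of independent predictors

# Eight independent standard-normal predictors x1..x8
X <- matrix(rnorm(n * k), nrow = n)
colnames(X) <- paste0("x", 1:k)

# A ninth predictor that is nearly the sum of the other eight
x9 <- rowSums(X) + rnorm(n, sd = 0.3)
X <- cbind(X, x9 = x9)

# Pairwise view: the largest absolute correlation between any two columns
R <- cor(X)
max_pairwise <- max(abs(R[upper.tri(R)]))

# Collinearity view: variance inflation factors, VIF_j = 1 / (1 - R_j^2),
# which equal the diagonal elements of the inverse correlation matrix
vif <- diag(solve(R))

# Condition number of the correlation matrix (large values signal collinearity)
cond_number <- kappa(R, exact = TRUE)

print(round(max_pairwise, 2))
print(round(vif, 1))
print(round(cond_number, 1))
```

A screen based only on the correlation matrix would pass these data, while the VIFs (especially for x9) and the condition number flag the problem. This is why checking pairwise correlations alone is not enough.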