Hi,

A colleague ran a stepwise discriminant analysis twice in a row and got different results, which suggests some stochasticity in the algorithms involved. I looked at her data and found a lot of collinearity, so I reckoned that maybe "stepclass" (klaR) cannot find a clear winner when trying to include a new variable and makes a random choice. Is that true?

Another possibility is that "lda" (from MASS) computes CV classification rates from a random subsample instead of using all the data. That might be a sensible choice with a very large sample.

I advised her to run the function several times and see whether a consensus emerges, but none seems to, and besides, I would like to know what is really going on.
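A third possibility, going by the "10-fold cross-validated correctness rate" in the output, is that observations are assigned to the folds at random, so that each run scores the candidate variables on different splits. I have not checked the klaR source, so this is only an assumption. The sketch below shows how random fold assignment alone can move a cross-validated correctness rate from one run to the next. The helper cv_rate is hypothetical (it is not a klaR or MASS function), and iris only stands in for her data.

```r
library(MASS)  # lda() and its predict method

# Hypothetical helper: k-fold cross-validated correctness rate of lda,
# with observations assigned to folds at random (an assumption about
# what happens inside stepclass, not taken from its source).
cv_rate <- function(x, grouping, k = 10) {
  folds <- sample(rep(seq_len(k), length.out = nrow(x)))  # random fold labels
  correct <- logical(nrow(x))
  for (i in seq_len(k)) {
    test <- folds == i
    fit  <- lda(x[!test, , drop = FALSE], grouping[!test])
    pred <- predict(fit, x[test, , drop = FALSE])$class
    correct[test] <- as.character(pred) == as.character(grouping[test])
  }
  mean(correct)  # share of held-out observations classified correctly
}

x <- as.matrix(iris[, 1:4])  # stand-in predictors
g <- iris$Species            # stand-in grouping factor

set.seed(1); r1 <- cv_rate(x, g)
set.seed(2); r2 <- cv_rate(x, g)  # different folds, so the rate may differ
set.seed(1); r3 <- cv_rate(x, g)  # same seed as r1, hence the same folds

print(c(r1, r2, r3))
print(identical(r1, r3))
```

If this is what happens, calling set.seed() with the same value before each stepclass run should give identical results, which would be a quick way to test the idea on her data.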
thanks,

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier
911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France

> f4.U.spDA <- stepclass(f.mes, f.gp4, "lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333;  in: "X2";  variables (1): X2
correctness rate: 0.66389;  in: "X9";  variables (2): X2, X9
correctness rate: 0.69583;  in: "X27";  variables (3): X2, X9, X27

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       20.77

> f4.U.spDA <- stepclass(f.mes, f.gp4, "lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556;  in: "X2";  variables (1): X2
correctness rate: 0.71806;  in: "X6";  variables (2): X2, X6

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       15.14

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.