Hello All,

I am trying to carry out variable reduction. I have no information about the dependent variable, only the X variables. In selecting the variables I wish to keep, I have considered the following criteria:

1) Percentage of missing values in each column/variable
2) Variance of each variable, with a cut-off value
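To make the two criteria concrete, here is a minimal sketch of the kind of filter I have in mind. The function name `filter_columns` and the cut-offs `max_missing` and `min_var` are hypothetical placeholders, not recommendations, and the variance check is applied to numeric columns only (categorical columns pass it untouched). Note that variance depends on the scale of each variable, so a single cut-off probably only makes sense after rescaling.

```r
# Hypothetical helper: drop columns with too many NAs or too little variance.
# max_missing and min_var are illustrative cut-offs only.
filter_columns <- function(X, max_missing = 0.3, min_var = 1e-4) {
  # Fraction of missing values in each column
  miss_frac <- colMeans(is.na(X))
  keep_missing <- miss_frac <= max_missing

  # Variance is computed for numeric columns only;
  # non-numeric columns (factors, characters) pass this check
  is_num <- vapply(X, is.numeric, logical(1))
  col_var <- rep(NA_real_, ncol(X))
  col_var[is_num] <- vapply(X[is_num], var, numeric(1), na.rm = TRUE)
  keep_var <- !is_num | (!is.na(col_var) & col_var >= min_var)

  X[, keep_missing & keep_var, drop = FALSE]
}

# Toy data, made up purely for illustration
X <- data.frame(
  a     = c(1, 2, 3, 4, 5),
  b     = c(NA, NA, NA, 1, 2),   # 60% missing
  c     = rep(7, 5),             # zero variance
  state = factor(c("NY", "CA", "TX", "NY", "CA"))
)
names(filter_columns(X))
```

On this toy data only `a` and `state` should survive: `b` fails the missing-value cut-off and `c` fails the variance cut-off.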
I recently came across Weka and found that the RWeka package would let me use Weka from R. Weka provides a "Genetic search" variable reduction method, but I could not find a corresponding R interface to it in the RWeka PDF file on CRAN.

I also looked for other R packages that perform variable reduction without considering a dependent variable. I came across the 'dprep' package, but it does not have a Windows implementation.

Moreover, my dataset contains both continuous and categorical variables. The categorical variables range from 3 levels, through 10 levels and so on, up to a maximum of 50 levels (e.g. states in the USA).

Any suggestions in this regard will be much appreciated.

Thank you
Harsh Singhal
Decision Systems, Mu Sigma, Inc.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.