RE: Feature selection interface

2014-07-18 Thread Ulanov, Alexander
FYI This is my first take on feature selection, filtering and chi-squared: https://github.com/apache/spark/pull/1484

-Original Message-
From: Ulanov, Alexander
Sent: Thursday, July 10, 2014 9:39 PM
To: dev@spark.apache.org
Subject: Feature selection interface

Hi, I've implemen

Feature selection interface

2014-07-10 Thread Ulanov, Alexander
Hi, I've implemented a class that does Chi-squared feature selection for RDD[LabeledPoint]. It also computes basic class/feature occurrence statistics, and other methods such as mutual information or information gain can easily be implemented on top of them. I would like to make a pull request. However, MLlib mas
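
For readers unfamiliar with the technique, here is a minimal sketch of chi-squared feature scoring built on class/feature occurrence counts. It is not the code from the pull request: it is plain Python rather than Scala on an RDD[LabeledPoint], it assumes binary (present/absent) features, and the names `chi_squared_scores` and `select_top_features` are hypothetical.

```python
from collections import Counter


def chi_squared_scores(points):
    """Score each feature by the chi-squared statistic against the class label.

    `points` is an iterable of (label, features) pairs, where `features` is a
    sequence of 0/1 values (assumption: binary features only).
    """
    points = list(points)
    n = len(points)
    num_features = len(points[0][1])

    # Occurrence statistics: examples per class, and examples per
    # (feature, class) pair in which the feature is present.
    class_counts = Counter(label for label, _ in points)
    joint = Counter()
    for label, features in points:
        for j, value in enumerate(features):
            if value:
                joint[(j, label)] += 1

    scores = []
    for j in range(num_features):
        present = sum(joint[(j, c)] for c in class_counts)
        absent = n - present
        score = 0.0
        for c, n_c in class_counts.items():
            observed_present = joint[(j, c)]
            observed_absent = n_c - observed_present
            for observed, row_total in ((observed_present, present),
                                        (observed_absent, absent)):
                # Expected count if feature and class were independent.
                expected = row_total * n_c / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_features(points, k):
    """Return the indices of the k highest-scoring features, in index order."""
    scores = chi_squared_scores(points)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

The same `class_counts` and `joint` tables would be enough to compute mutual information or information gain, which is presumably why the occurrence statistics are computed separately from the chi-squared score.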