FYI, this is my first take on feature selection, filtering, and chi-squared:
https://github.com/apache/spark/pull/1484
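
For readers who want the idea without opening the PR, here is a minimal sketch of chi-squared feature scoring in plain Python. It is not the code from the pull request and does not use Spark. The function names (chi_squared_scores, select_top_k) and the input format (a label paired with the set of feature indices present in that point) are assumptions made for illustration only. Each feature is treated as present/absent, and its score is the chi-squared statistic of the presence-by-class contingency table.

```python
from collections import Counter


def chi_squared_scores(points):
    """Score each feature by the chi-squared statistic of its
    presence/absence against the class label.

    points: iterable of (label, set_of_present_feature_indices).
    This input format is a hypothetical stand-in for labeled points.
    """
    points = list(points)
    n = len(points)
    class_counts = Counter(label for label, _ in points)

    # joint[(feature, label)]: points of that class in which the feature is present
    joint = Counter()
    feature_counts = Counter()
    for label, feats in points:
        for f in feats:
            joint[(f, label)] += 1
            feature_counts[f] += 1

    scores = {}
    for f, present in feature_counts.items():
        absent = n - present
        chi2 = 0.0
        for label, class_total in class_counts.items():
            observed_present = joint[(f, label)]
            observed_absent = class_total - observed_present
            for observed, row_total in ((observed_present, present),
                                        (observed_absent, absent)):
                expected = row_total * class_total / n
                if expected > 0:  # skip empty rows (feature always present)
                    chi2 += (observed - expected) ** 2 / expected
        scores[f] = chi2
    return scores


def select_top_k(points, k):
    """Return the k feature indices with the highest chi-squared score
    (ties broken by the lower feature index)."""
    scores = chi_squared_scores(points)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


if __name__ == "__main__":
    data = [(1, {0, 1}), (1, {0}), (0, {1}), (0, {1, 2})]
    print(chi_squared_scores(data))
    print(select_top_k(data, 1))
```

The class and feature occurrence counts gathered in the first loop are the same statistics that other filters such as mutual information would need, so only the scoring loop would change.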
-Original Message-
From: Ulanov, Alexander
Sent: Thursday, July 10, 2014 9:39 PM
To: dev@spark.apache.org
Subject: Feature selection interface
Hi,
I've implemented a class that does chi-squared feature selection for
RDD[LabeledPoint]. It also computes basic class/feature occurrence statistics,
and other methods such as mutual information or information gain can easily be
implemented. I would like to make a pull request. However, MLlib mas