Re: Evaluation Metrics for Spark's MLlib

2014-12-11 Thread Joseph Bradley
Hi, I'd recommend starting by checking out the existing helper functionality for these tasks. There are helper methods to do K-fold cross-validation in MLUtils:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
The experimental spark.ml API
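For readers following the thread, here is a minimal plain-Python sketch of what K-fold splitting does. It is only an illustration of the idea, not the MLUtils implementation: the function name `k_fold`, its parameters, and the round-robin fold assignment are assumptions made for this example, and it works on an in-memory list rather than an RDD.

```python
import random


def k_fold(items, num_folds, seed=42):
    """Return a list of (training, validation) pairs, one per fold.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if num_folds < 2:
        raise ValueError("num_folds must be at least 2")

    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    # Deal the shuffled items into num_folds roughly equal folds.
    folds = [shuffled[i::num_folds] for i in range(num_folds)]

    splits = []
    for i in range(num_folds):
        validation = folds[i]
        # Training data is everything outside the held-out fold.
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((training, validation))
    return splits


if __name__ == "__main__":
    for training, validation in k_fold(range(10), num_folds=5):
        print(len(training), len(validation))
```

Each element appears in exactly one validation fold, so a model trained and scored once per split is evaluated on every data point exactly once.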

Evaluation Metrics for Spark's MLlib

2014-12-11 Thread kidynamit
Hi, I would like to contribute to Spark's machine learning library by adding evaluation metrics for gauging the accuracy of a model on a given feature set. In particular, I would like to contribute k-fold validation metrics and the F-beta metric, among others, on top of the current ML
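For reference, the F-beta metric mentioned above is the weighted harmonic mean of precision and recall, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The following is a minimal plain-Python sketch computed from confusion-matrix counts; the function name `f_beta` and its signature are assumptions for this example and do not reflect any existing MLlib API.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive and false-negative counts.

    Hypothetical helper for illustration only. Returns 0.0 when the score is
    undefined (no predicted positives and no actual positives).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    # beta = 1 weights precision and recall equally (the F1 score).
    print(f_beta(tp=8, fp=2, fn=2))
    # beta = 2 weights recall more heavily than precision.
    print(f_beta(tp=6, fp=2, fn=6, beta=2.0))
```

Values of beta above 1 favour recall, and values below 1 favour precision.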