Hi, I'd recommend starting by checking out the existing helper
functionality for these tasks. There are helper methods to do K-fold
cross-validation in MLUtils:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
The experimental spark.ml API
Hi,
I would like to contribute to Spark's Machine Learning library by adding
evaluation metrics that would be used to gauge the accuracy of a model given
a certain features' set. In particular, I seek to contribute the k-fold
validation metrics, f-beta metric among others on top of the current ML