Fokko Driesprong created FLINK-5426:
---------------------------------------

             Summary: Clean up the Flink Machine Learning library
                 Key: FLINK-5426
                 URL: https://issues.apache.org/jira/browse/FLINK-5426
             Project: Flink
          Issue Type: Improvement
          Components: Machine Learning Library
            Reporter: Fokko Driesprong

Hi guys,

I would like to clean up the Machine Learning library. A lot of the code in the ML library does not conform to the original contribution guide. For example:

Duplicate tests, with different names but exactly the same test case:
https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148
https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164

Many multi-line test cases:
https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala

Misuse of constants:
https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58

Please allow me to clean this up. I'm looking forward to contributing more code, especially to the ML part. I have been a contributor to Apache Spark and am happy to extend the codebase with new distributed algorithms and make the codebase more mature.

Cheers,
Fokko

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)