Hello all, I've spent some time developing a library for data preprocessing in Flink.
I am reaching out because the library is almost finished, and this month I will be submitting a paper to a journal (pre-print available on arXiv: https://arxiv.org/abs/1810.06021).

I've checked Flink's roadmap (https://cwiki.apache.org/confluence/display/FLINK/FlinkML%3A+Vision+and+Roadmap) and saw that you want to implement dimensionality reduction. My library has six preprocessing algorithms: three discretizers and three feature selection methods. I was wondering whether there is any possibility of integrating them into Flink. I am also willing to make any necessary changes to the algorithms if you think they could be implemented more efficiently. This would also allow me to improve my knowledge of and skill with Flink.

The code is at https://github.com/elbaulp/DPASF

Hoping to hear from you soon.

Best regards,

*-- Alejandro Alcalde - elbauldelprogramador.com <http://elbauldelprogramador.com>*