Hello Flinkers,

I am building a *streaming* prototype system on top of Flink, and ideally I want to do the ML training (DL too, if possible) inside Flink. It would be nice to lay out all the existing libraries that provide primitives for training ML models.
I assume it is more efficient to do all the training in Flink (somehow) rather than (re)training a model in TensorFlow (or PyTorch) and porting it to a Flink job. For instance, see https://stackoverflow.com/questions/59563265/embedd-existing-ml-model-in-apache-flink
In streaming ML systems especially, both the training and the serving should happen in an online fashion.

To initialize the pool, I have found the following options that run on top of Flink, i.e., that leverage the engine for distributed and scalable ML training.

1) *FlinkML (DataSet API)*
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/index.html
This is not for streaming ML, as it sits on top of the DataSet API. In addition, the library was recently dropped
https://stackoverflow.com/questions/58752787/what-is-the-status-of-flinkml
but there is ongoing development (??) of a new library on top of the Table API
https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
https://issues.apache.org/jira/browse/FLINK-12470
which is not in the 1.10 distribution.

2) *Apache Mahout*
https://mahout.apache.org/
I thought it was long dead, but development has recently started again.

3) *Apache SAMOA*
https://samoa.incubator.apache.org/
It is being developed, but slowly. It has been an incubator project since 2013.

4) *FlinkML Organization*
https://github.com/FlinkML
This one has some interesting repos, e.g. flink-jpmml
https://github.com/FlinkML/flink-jpmml
and an implementation of a parameter server
https://github.com/FlinkML/flink-parameter-server
which is really useful for distributed training, in the sense that the model itself is also distributed during training. The repos are not really active, though.
5) *DeepLearning4j*
https://deeplearning4j.org/
This is a distributed deep learning library that was said to also work on top of Flink (see http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-support-for-DeepLearning4j-or-other-deep-learning-library-td12157.html). I am not interested in GPU support at all, but I am wondering if anyone has successfully used this one on top of Flink.

6) *Proteus - SOLMA*
https://github.com/proteus-h2020/proteus-solma
It is a scalable online learning library on top of Flink, and is the output of an H2020 research project called PROTEUS.
http://www.bdva.eu/sites/default/files/hbouchachia_sacbd-ecsa18.pdf

7) *Alibaba - ALink*
https://github.com/alibaba/Alink/blob/master/README.en-US.md
A machine learning algorithm platform from Alibaba, which is actively maintained.

These are all the available systems I have found that do ML using Flink's engine.

*Questions*
(i) Has anyone used them?
(ii) More specifically, has anyone implemented *Stochastic Gradient Descent, Skip-gram models, Autoencoders* with any of the above tools (or others)?

*Remarks*
If you have any experiences/comments/additions to share, please do!
Gotta Catch 'Em All! <https://www.youtube.com/watch?v=MpaHR-V_R-o>

Best,
Max
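
P.S. To make question (ii) concrete, here is a minimal sketch of the kind of per-record (online) SGD update meant above, for plain linear regression with squared loss. It is ordinary Java with no Flink dependency. The class name `OnlineSgd`, its methods, the learning rate, and the toy stream are all assumptions made up for illustration; none of this is the API of any library listed above.

```java
import java.util.Arrays;

/** Hypothetical online SGD learner: the model is updated once per incoming record. */
public class OnlineSgd {
    private final double[] weights;
    private final double learningRate;
    private double bias;

    public OnlineSgd(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures]; // starts at all zeros
        this.learningRate = learningRate;
    }

    /** Serving path: linear prediction with the current model. */
    public double predict(double[] x) {
        double y = bias;
        for (int i = 0; i < weights.length; i++) {
            y += weights[i] * x[i];
        }
        return y;
    }

    /**
     * Training path: one SGD step on the loss 0.5 * (prediction - y)^2.
     * Returns the prediction made before the update, so one record can be
     * used for both serving and training.
     */
    public double update(double[] x, double y) {
        double prediction = predict(x);
        double error = prediction - y;
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= learningRate * error * x[i];
        }
        bias -= learningRate * error;
        return prediction;
    }

    public double[] weights() {
        return weights.clone();
    }

    public double bias() {
        return bias;
    }

    public static void main(String[] args) {
        OnlineSgd model = new OnlineSgd(1, 0.05);
        // Toy "stream": noise-free records from y = 2x + 1, with x cycling over 0..4.
        for (int t = 0; t < 5000; t++) {
            double x = t % 5;
            model.update(new double[] {x}, 2.0 * x + 1.0);
        }
        System.out.println("weights=" + Arrays.toString(model.weights())
                + " bias=" + model.bias());
    }
}
```

Inside Flink, one plausible way to host such a step would be a keyed stateful function (e.g. a `KeyedProcessFunction`) that keeps the weights in keyed state and calls `update` per element. That wiring is not shown here, and it says nothing about how the listed libraries actually do it.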