Hi to all,

I have read many discussions about Flink ML and none of them take into account the ongoing efforts carried out by the Streamline H2020 project [1] on this topic. Have you tried to ping them? I think that both projects could benefit from a joint effort on this side.

[1] https://h2020-streamline-project.eu/objectives/
Best,
Flavio

On Thu, May 2, 2019 at 12:18 AM Rong Rong <walter...@gmail.com> wrote:

> Hi Shaoxuan/Weihua,
>
> Thanks for the proposal and for driving this effort.
> I also replied to the original discussion thread, and I am still +1 on
> moving towards the scikit-learn model.
> I just left a few comments on the API details and some general questions.
> Please kindly take a look.
>
> There's another thread regarding a close-to-merge FLIP-23 implementation
> [1]. I agree it might still be too early to talk about productionizing
> and model serving, but it would be nice to keep in mind during design and
> implementation that ease of use when productionizing an ML pipeline is
> also very important.
> And if we can leverage the FLIP-23 implementation in the future (some
> adjustments might be needed), that would be super helpful.
>
> Best,
> Rong
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-23-Model-Serving-td20260.html
>
> On Tue, Apr 30, 2019 at 1:47 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>
> > Thanks for all the feedback.
> >
> > @Jincheng Sun
> > > I recommend it's better to add a detailed implementation plan to the
> > > FLIP and the Google doc.
> > Yes, I will add a subsection for the implementation plan.
> >
> > @Chen Qin
> > > Just to share some insights from operating SparkML at scale:
> > > - MapReduce may not be the best way to iteratively sync partitioned
> > > workers.
> > > - Native hardware acceleration is key to adopting rapid ML
> > > improvements in the foreseeable future.
> > Thanks for sharing your experience with SparkML. The purpose of this
> > FLIP is mainly to provide the interfaces for the ML pipeline and ML lib,
> > and the implementations of most standard algorithms. Beyond this FLIP,
> > for AI computing on Flink, we will continue to contribute efforts such
> > as enhanced iteration support and the integration of deep learning
> > engines (such as TensorFlow/PyTorch).
> > I have presented part of this work in
> > https://www.ververica.com/resources/flink-forward-san-francisco-2019/when-table-meets-ai-build-flink-ai-ecosystem-on-table-api
> > I am not sure if I have fully understood your comments. Can you please
> > elaborate on them with more details, and if possible, please provide
> > some suggestions about what we should work on to address the challenges
> > you mentioned.
> >
> > Regards,
> > Shaoxuan
> >
> > On Mon, Apr 29, 2019 at 11:28 AM Chen Qin <qinnc...@gmail.com> wrote:
> >
> > > Just to share some insights from operating SparkML at scale:
> > > - MapReduce may not be the best way to iteratively sync partitioned
> > > workers.
> > > - Native hardware acceleration is key to adopting rapid ML
> > > improvements in the foreseeable future.
> > >
> > > Chen
> > >
> > > On Apr 29, 2019, at 11:02, jincheng sun <sunjincheng...@gmail.com>
> > > wrote:
> > > >
> > > > Hi Shaoxuan,
> > > >
> > > > Thanks for your efforts to enhance the scalability and ease of use
> > > > of Flink ML and take it one step further. Thank you for sharing a
> > > > lot of context information.
> > > >
> > > > Big +1 for this proposal!
> > > >
> > > > Just one suggestion: since there is only a short time until the
> > > > release of Flink 1.9, I recommend adding a detailed implementation
> > > > plan to the FLIP and the Google doc.
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > > On Mon, Apr 29, 2019 at 10:34 AM Shaoxuan Wang <wshaox...@gmail.com>
> > > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Weihua proposed rebuilding the Flink ML pipeline on top of the
> > > >> Table API several months ago in this mail thread:
> > > >>
> > > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Embracing-Table-API-in-Flink-ML-td25368.html
> > > >>
> > > >> Luogen, Becket, Xu, Weihua and I have been working on this proposal
> > > >> offline over the past few months. Now we want to share the first
> > > >> phase of the entire proposal as a FLIP. In FLIP-39, we want to
> > > >> achieve several things (and hope they can be accomplished and
> > > >> released in Flink 1.9):
> > > >>
> > > >> - Provide a new set of ML core interfaces (on top of the Flink
> > > >>   Table API)
> > > >> - Provide an ML pipeline interface (on top of the Flink Table API)
> > > >> - Provide interfaces for parameter management and pipeline/model
> > > >>   persistence
> > > >> - All the above interfaces should facilitate any new ML algorithm.
> > > >>   We will gradually add various standard ML algorithms on top of
> > > >>   these newly proposed interfaces to ensure their feasibility and
> > > >>   scalability.
> > > >>
> > > >> Part of this FLIP was presented at Flink Forward 2019 in San
> > > >> Francisco by Xu and me.
> > > >>
> > > >> https://sf-2019.flink-forward.org/conference-program#when-table-meets-ai--build-flink-ai-ecosystem-on-table-api
> > > >>
> > > >> https://sf-2019.flink-forward.org/conference-program#high-performance-ml-library-based-on-flink
> > > >>
> > > >> You can find the videos & slides at
> > > >> https://www.ververica.com/flink-forward-san-francisco-2019
> > > >>
> > > >> The design document for FLIP-39 can be found here:
> > > >>
> > > >> https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo
> > > >>
> > > >> I am looking forward to your feedback.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Shaoxuan
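[Editor's note: the thread repeatedly refers to "the scikit-learn model" of ML pipelines (Estimators that `fit()` data to produce Transformers, chained into a Pipeline). The following is a minimal illustrative sketch of that general pattern only; the class names `Scaler`/`ScalerModel` are hypothetical, and the actual FLIP-39 interfaces are defined in Java and operate on Flink `Table` objects, not Python lists — see the linked design document for the real API.]

```python
# Sketch of the scikit-learn-style Estimator/Transformer/Pipeline pattern
# discussed in the thread. Hypothetical example, NOT the FLIP-39 API: the
# real interfaces are Java/Scala and work on Flink Tables.

class Transformer:
    """A fitted or stateless stage that maps input data to output data."""
    def transform(self, data):
        raise NotImplementedError

class Estimator:
    """A trainable stage: fit() consumes data and returns a Transformer."""
    def fit(self, data):
        raise NotImplementedError

class ScalerModel(Transformer):
    """Fitted model produced by Scaler: subtracts the learned mean."""
    def __init__(self, mean):
        self.mean = mean
    def transform(self, data):
        return [x - self.mean for x in data]

class Scaler(Estimator):
    """Learns the mean of the training data."""
    def fit(self, data):
        return ScalerModel(sum(data) / len(data))

class Pipeline(Estimator, Transformer):
    """A chain of stages; fit() replaces each Estimator with its model."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, data):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(data)      # train this stage
            data = stage.transform(data)     # feed result to the next stage
            fitted.append(stage)
        return Pipeline(fitted)              # fitted pipeline is a Transformer
    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data

model = Pipeline([Scaler()]).fit([1.0, 2.0, 3.0])
print(model.transform([4.0]))  # learned mean 2.0 is subtracted -> [2.0]
```

The key property this pattern gives (and which the proposal inherits): a fitted `Pipeline` is itself a `Transformer`, so the same object can be persisted and later used for serving, which is where the FLIP-23 model-serving discussion above connects.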