Thanks for following up promptly and sharing the feedback, @shaoxuan. Yes, I share your view that these two FLIPs will eventually converge. I also have some questions regarding the API as well as the possible convergence challenges (especially the current co-processor approach vs. FLIP-39's Table API approach). I will follow up on the discussion thread and the FLIP-23 PR with you and Boris :-)
--
Rong

On Mon, May 6, 2019 at 3:30 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>
> Thanks for the feedback, Rong and Flavio.
>
> @Rong Rong
> > There's another thread regarding a close-to-merge FLIP-23 implementation
> > [1]. I agree this might still be an early stage to talk about
> > productionizing and model serving, but it would be nice to keep in mind
> > in the design/implementation that ease of use for productionizing an ML
> > pipeline is also very important.
> > And if we can leverage the implementation in FLIP-23 in the future (some
> > adjustment might be needed), that would be super helpful.
> You raised a very good point. Actually, I have been reviewing FLIP-23 for
> a while (mostly offline, to help Boris polish the PR). From my point of
> view, FLIP-23 and FLIP-39 can be well unified at some point. Model serving
> in FLIP-23 is actually a special case of the "transformer/model" proposed
> in FLIP-39. Boris's implementation of model serving can be designed as an
> abstract class on top of the transformer/model interface, and then be used
> by ML users as a certain ML lib. I have some other comments w.r.t. FLIP-23
> x FLIP-39; I will reply to the FLIP-23 mailing list later with more
> details.
>
> @Flavio
> > I have read many discussions about Flink ML and none of them take into
> > account the ongoing efforts carried out by the Streamline H2020 project
> > [1] on this topic.
> > Have you tried to ping them? I think that both projects could benefit
> > from a joint effort on this side..
> > [1] https://h2020-streamline-project.eu/objectives/
> Thank you for your info. I was not aware of the Streamline H2020 projects
> before. I just did a quick look at the website and GitHub. IMO these
> projects could be very good Flink ecosystem projects and can be built on
> top of the ML pipeline & ML lib interfaces introduced in FLIP-39. I will
> try to contact the owners of these projects to understand their plans and
> blockers for using Flink (if there are any).
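[Editor's sketch, for readers of the archive: the "model serving as a special case of transformer" idea above can be illustrated roughly as follows. This is a minimal Python sketch of the API shape only; all names here (Transformer, ModelServingTransformer) are hypothetical, and the real FLIP-39 interfaces are defined in Java/Scala on top of the Flink Table API in the design doc linked later in the thread.]

```python
# Sketch: FLIP-23-style model serving expressed as one concrete kind of
# FLIP-39-style Transformer. Tables are mocked as lists of dict rows; a real
# implementation would consume and produce Flink Tables.
from abc import ABC, abstractmethod

class Transformer(ABC):
    """FLIP-39-style transformer: maps an input table to an output table."""
    @abstractmethod
    def transform(self, table):
        ...

class ModelServingTransformer(Transformer):
    """Model serving as a special case of Transformer: load a trained model
    (e.g. a deserialized PMML/TensorFlow model; here just a function) and
    use it to score every incoming row."""
    def __init__(self, model):
        self.model = model

    def transform(self, table):
        # Append a prediction column to each row.
        return [dict(row, prediction=self.model(row)) for row in table]

# Usage: a trivial "model" that thresholds a feature.
serving = ModelServingTransformer(model=lambda row: row["x"] > 0.5)
scored = serving.transform([{"x": 0.2}, {"x": 0.9}])
```

Framing serving this way is what allows an ML-lib user to drop a served model into the same pipeline slot as any other transformer.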
> In the meantime, if you have the direct contact of anyone who might be
> interested in the ML pipeline & ML lib, please share it with me.
>
> Regards,
> Shaoxuan
>
>
> On Thu, May 2, 2019 at 3:59 PM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>> Hi to all,
>> I have read many discussions about Flink ML and none of them take into
>> account the ongoing efforts carried out by the Streamline H2020 project
>> [1] on this topic.
>> Have you tried to ping them? I think that both projects could benefit
>> from a joint effort on this side..
>> [1] https://h2020-streamline-project.eu/objectives/
>>
>> Best,
>> Flavio
>>
>> On Thu, May 2, 2019 at 12:18 AM Rong Rong <walter...@gmail.com> wrote:
>> > Hi Shaoxuan/Weihua,
>> >
>> > Thanks for the proposal and driving the effort.
>> > I also replied to the original discussion thread, and still a +1 on
>> > moving towards the scikit-learn model.
>> > I just left a few comments on the API details and some general
>> > questions. Please kindly take a look.
>> >
>> > There's another thread regarding a close-to-merge FLIP-23
>> > implementation [1]. I agree this might still be an early stage to talk
>> > about productionizing and model serving, but it would be nice to keep
>> > in mind in the design/implementation that ease of use for
>> > productionizing an ML pipeline is also very important.
>> > And if we can leverage the implementation in FLIP-23 in the future
>> > (some adjustment might be needed), that would be super helpful.
>> >
>> > Best,
>> > Rong
>> >
>> > [1]
>> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-23-Model-Serving-td20260.html
>> >
>> >
>> > On Tue, Apr 30, 2019 at 1:47 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>> > > Thanks for all the feedback.
>> > >
>> > > @Jincheng Sun
>> > > > I recommend it's better to add a detailed implementation plan to
>> > > > the FLIP and the Google doc.
>> > > Yes, I will add a subsection for the implementation plan.
>> > >
>> > > @Chen Qin
>> > > > Just to share some insights from operating SparkML at scale:
>> > > > - map-reduce may not be the best way to iteratively sync
>> > > > partitioned workers.
>> > > > - native hardware acceleration is key to adopting rapid changes in
>> > > > ML improvements in the foreseeable future.
>> > > Thanks for sharing your experience on SparkML. The purpose of this
>> > > FLIP is mainly to provide the interfaces for the ML pipeline and ML
>> > > lib, and the implementations of most standard algorithms. Besides
>> > > this FLIP, for AI computing on Flink, we will continue to contribute
>> > > efforts such as the enhancement of iteration and the integration of
>> > > deep learning engines (such as TensorFlow/PyTorch). I have presented
>> > > part of this work in
>> > > https://www.ververica.com/resources/flink-forward-san-francisco-2019/when-table-meets-ai-build-flink-ai-ecosystem-on-table-api
>> > > I am not sure if I have fully understood your comments. Can you
>> > > please elaborate on them with more details, and if possible, provide
>> > > some suggestions about what we should work on to address the
>> > > challenges you have mentioned?
>> > >
>> > > Regards,
>> > > Shaoxuan
>> > >
>> > > On Mon, Apr 29, 2019 at 11:28 AM Chen Qin <qinnc...@gmail.com> wrote:
>> > > > Just to share some insights from operating SparkML at scale:
>> > > > - map-reduce may not be the best way to iteratively sync
>> > > > partitioned workers.
>> > > > - native hardware acceleration is key to adopting rapid changes in
>> > > > ML improvements in the foreseeable future.
>> > > >
>> > > > Chen
>> > > >
>> > > > On Apr 29, 2019, at 11:02, jincheng sun <sunjincheng...@gmail.com> wrote:
>> > > > >
>> > > > > Hi Shaoxuan,
>> > > > >
>> > > > > Thanks for putting more effort into enhancing the scalability and
>> > > > > ease of use of Flink ML and taking it one step further.
>> > > > > Thank you for sharing a lot of context information.
>> > > > >
>> > > > > Big +1 for this proposal!
>> > > > >
>> > > > > Only one suggestion here: there is only a short time left until
>> > > > > the release of Flink 1.9, so I recommend adding a detailed
>> > > > > implementation plan to the FLIP and the Google doc.
>> > > > >
>> > > > > What do you think?
>> > > > >
>> > > > > Best,
>> > > > > Jincheng
>> > > > >
>> > > > > Shaoxuan Wang <wshaox...@gmail.com> wrote on Mon, Apr 29, 2019 at 10:34 AM:
>> > > > >
>> > > > >> Hi everyone,
>> > > > >>
>> > > > >> Weihua proposed rebuilding the Flink ML pipeline on top of the
>> > > > >> Table API several months ago in this mail thread:
>> > > > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Embracing-Table-API-in-Flink-ML-td25368.html
>> > > > >>
>> > > > >> Luogen, Becket, Xu, Weihua and I have been working on this
>> > > > >> proposal offline over the past few months. Now we want to share
>> > > > >> the first phase of the entire proposal as a FLIP. In FLIP-39, we
>> > > > >> want to achieve several things (and hope they can be
>> > > > >> accomplished and released in Flink 1.9):
>> > > > >>
>> > > > >> - Provide a new set of ML core interfaces (on top of the Flink
>> > > > >>   Table API)
>> > > > >> - Provide an ML pipeline interface (on top of the Flink Table
>> > > > >>   API)
>> > > > >> - Provide the interfaces for parameter management and
>> > > > >>   pipeline/model persistence
>> > > > >> - All the above interfaces should facilitate any new ML
>> > > > >>   algorithm. We will gradually add various standard ML
>> > > > >>   algorithms on top of these newly proposed interfaces to ensure
>> > > > >>   their feasibility and scalability.
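[Editor's sketch, for readers of the archive: the scikit-learn-style split the bullets above describe can be sketched roughly as follows. This is a Python sketch of the API shape only, with hypothetical names and toy in-memory "tables"; the real FLIP-39 interfaces are Java/Scala on top of the Flink Table API, as specified in the design doc.]

```python
# Sketch: an Estimator is fit on a table and produces a Transformer (the
# trained model); a Pipeline chains stages of both kinds, fitting each
# Estimator stage in order and replacing it with its model.
from abc import ABC, abstractmethod

class Transformer(ABC):
    @abstractmethod
    def transform(self, table): ...

class Estimator(ABC):
    @abstractmethod
    def fit(self, table):  # returns a Transformer (the trained model)
        ...

class ScaleEstimator(Estimator):
    """Toy estimator: learns the max of column 'x' for max-scaling."""
    def fit(self, table):
        max_x = max(row["x"] for row in table)
        class Scaler(Transformer):
            def transform(self, t):
                return [dict(r, x=r["x"] / max_x) for r in t]
        return Scaler()

class PipelineModel(Transformer):
    """The fitted pipeline: applies each fitted stage in sequence."""
    def __init__(self, stages):
        self.stages = stages
    def transform(self, table):
        for s in self.stages:
            table = s.transform(table)
        return table

class Pipeline(Estimator):
    """Chains stages; Estimator stages are fitted and replaced by models."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, table):
        fitted = []
        for s in self.stages:
            s = s.fit(table) if isinstance(s, Estimator) else s
            table = s.transform(table)
            fitted.append(s)
        return PipelineModel(fitted)

# Usage: fit on training rows, then score new rows with the learned model.
data = [{"x": 2.0}, {"x": 4.0}]
model = Pipeline([ScaleEstimator()]).fit(data)
print(model.transform([{"x": 1.0}]))  # -> [{'x': 0.25}]
```

Note how a PipelineModel is itself a Transformer, which is what lets pipelines nest and lets a served model (as in the FLIP-23 discussion above) occupy the same slot as any other stage.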
>> > > > >>
>> > > > >> Part of this FLIP was presented at Flink Forward 2019 @ San
>> > > > >> Francisco by Xu and me:
>> > > > >> https://sf-2019.flink-forward.org/conference-program#when-table-meets-ai--build-flink-ai-ecosystem-on-table-api
>> > > > >> https://sf-2019.flink-forward.org/conference-program#high-performance-ml-library-based-on-flink
>> > > > >>
>> > > > >> You can find the videos & slides at
>> > > > >> https://www.ververica.com/flink-forward-san-francisco-2019
>> > > > >>
>> > > > >> The design document for FLIP-39 can be found here:
>> > > > >> https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo
>> > > > >>
>> > > > >> I am looking forward to your feedback.
>> > > > >>
>> > > > >> Regards,
>> > > > >>
>> > > > >> Shaoxuan