Hi to all,

I have read many discussions about Flink ML and none of them take into account the ongoing efforts carried out by the Streamline H2020 project [1] on this topic. Have you tried to ping them? I think that both projects could benefit from a joint effort on this side.

[1] https://h2020-streamline-project.eu/objectives/
Best,
Flavio

On Thu, May 2, 2019 at 12:18 AM Rong Rong <walter...@gmail.com> wrote:

> Hi Shaoxuan/Weihua,
>
> Thanks for the proposal and for driving this effort.
> I also replied to the original discussion thread, and I am still +1 on
> moving towards the scikit-learn model.
> I just left a few comments on the API details and some general questions.
> Please kindly take a look.
>
> There's another thread regarding a close-to-merge FLIP-23 implementation
> [1]. I agree it might still be too early to talk about productionizing
> and model serving, but it would be nice to keep in mind during design and
> implementation that ease of use when productionizing an ML pipeline is
> also very important.
> And if we can leverage the FLIP-23 implementation in the future (some
> adjustments might be needed), that would be super helpful.
>
> Best,
> Rong
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-23-Model-Serving-td20260.html
>
> On Tue, Apr 30, 2019 at 1:47 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>
> > Thanks for all the feedback.
> >
> > @Jincheng Sun
> > > I recommend it's better to add a detailed implementation plan to the
> > > FLIP and the Google doc.
> > Yes, I will add a subsection for the implementation plan.
> >
> > @Chen Qin
> > > Just to share some insights from operating SparkML at scale:
> > > - MapReduce may not be the best way to iteratively sync partitioned
> > > workers.
> > > - Native hardware acceleration is key to adopting rapid ML
> > > improvements in the foreseeable future.
> > Thanks for sharing your experience with SparkML. The purpose of this
> > FLIP is mainly to provide the interfaces for the ML pipeline and ML lib,
> > and the implementations of most standard algorithms. Beyond this FLIP,
> > for AI computing on Flink, we will continue to contribute efforts such
> > as enhanced iteration support and the integration of deep learning
> > engines (such as TensorFlow/PyTorch).
> > I have presented part of this work in
> > https://www.ververica.com/resources/flink-forward-san-francisco-2019/when-table-meets-ai-build-flink-ai-ecosystem-on-table-api
> > I am not sure if I have fully understood your comments. Can you please
> > elaborate on them with more details, and if possible, please provide
> > some suggestions about what we should work on to address the challenges
> > you mentioned.
> >
> > Regards,
> > Shaoxuan
> >
> > On Mon, Apr 29, 2019 at 11:28 AM Chen Qin <qinnc...@gmail.com> wrote:
> >
> > > Just to share some insights from operating SparkML at scale:
> > > - MapReduce may not be the best way to iteratively sync partitioned
> > > workers.
> > > - Native hardware acceleration is key to adopting rapid ML
> > > improvements in the foreseeable future.
> > >
> > > Chen
> > >
> > > On Apr 29, 2019, at 11:02, jincheng sun <sunjincheng...@gmail.com>
> > > wrote:
> > > >
> > > > Hi Shaoxuan,
> > > >
> > > > Thanks for your efforts to enhance the scalability and ease of use
> > > > of Flink ML and take it one step further. Thank you for sharing a
> > > > lot of context information.
> > > >
> > > > Big +1 for this proposal!
> > > >
> > > > Just one suggestion: since there is only a short time until the
> > > > release of Flink 1.9, I recommend adding a detailed implementation
> > > > plan to the FLIP and the Google doc.
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > > On Mon, Apr 29, 2019 at 10:34 AM Shaoxuan Wang <wshaox...@gmail.com>
> > > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Weihua proposed rebuilding the Flink ML pipeline on top of the
> > > >> Table API several months ago in this mail thread:
> > > >>
> > > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Embracing-Table-API-in-Flink-ML-td25368.html
> > > >>
> > > >> Luogen, Becket, Xu, Weihua and I have been working on this proposal
> > > >> offline over the past few months. Now we want to share the first
> > > >> phase of the entire proposal as a FLIP. In FLIP-39, we want to
> > > >> achieve several things (and hope they can be accomplished and
> > > >> released in Flink 1.9):
> > > >>
> > > >> - Provide a new set of ML core interfaces (on top of the Flink
> > > >>   Table API)
> > > >> - Provide an ML pipeline interface (on top of the Flink Table API)
> > > >> - Provide interfaces for parameter management and pipeline/model
> > > >>   persistence
> > > >> - All the above interfaces should facilitate any new ML algorithm.
> > > >>   We will gradually add various standard ML algorithms on top of
> > > >>   these newly proposed interfaces to ensure their feasibility and
> > > >>   scalability.
> > > >>
> > > >> Part of this FLIP was presented at Flink Forward 2019 in San
> > > >> Francisco by Xu and me.
> > > >>
> > > >> https://sf-2019.flink-forward.org/conference-program#when-table-meets-ai--build-flink-ai-ecosystem-on-table-api
> > > >>
> > > >> https://sf-2019.flink-forward.org/conference-program#high-performance-ml-library-based-on-flink
> > > >>
> > > >> You can find the videos & slides at
> > > >> https://www.ververica.com/flink-forward-san-francisco-2019
> > > >>
> > > >> The design document for FLIP-39 can be found here:
> > > >>
> > > >> https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo
> > > >>
> > > >> I am looking forward to your feedback.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Shaoxuan
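[Editor's note: the thread repeatedly refers to "the scikit-learn model" of ML pipelines (Estimators that `fit()` data to produce Transformers, chained into a Pipeline). The following is a minimal illustrative sketch of that general pattern only; the class names `Scaler`/`ScalerModel` are hypothetical, and the actual FLIP-39 interfaces are defined in Java and operate on Flink `Table` objects, not Python lists — see the linked design document for the real API.]

```python
# Sketch of the scikit-learn-style Estimator/Transformer/Pipeline pattern
# discussed in the thread. Hypothetical example, NOT the FLIP-39 API: the
# real interfaces are Java/Scala and work on Flink Tables.

class Transformer:
    """A fitted or stateless stage that maps input data to output data."""
    def transform(self, data):
        raise NotImplementedError

class Estimator:
    """A trainable stage: fit() consumes data and returns a Transformer."""
    def fit(self, data):
        raise NotImplementedError

class ScalerModel(Transformer):
    """Fitted model produced by Scaler: subtracts the learned mean."""
    def __init__(self, mean):
        self.mean = mean
    def transform(self, data):
        return [x - self.mean for x in data]

class Scaler(Estimator):
    """Learns the mean of the training data."""
    def fit(self, data):
        return ScalerModel(sum(data) / len(data))

class Pipeline(Estimator, Transformer):
    """A chain of stages; fit() replaces each Estimator with its model."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, data):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(data)      # train this stage
            data = stage.transform(data)     # feed result to the next stage
            fitted.append(stage)
        return Pipeline(fitted)              # fitted pipeline is a Transformer
    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data

model = Pipeline([Scaler()]).fit([1.0, 2.0, 3.0])
print(model.transform([4.0]))  # learned mean 2.0 is subtracted -> [2.0]
```

The key property this pattern gives (and which the proposal inherits): a fitted `Pipeline` is itself a `Transformer`, so the same object can be persisted and later used for serving, which is where the FLIP-23 model-serving discussion above connects.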