Hi Till,

Thanks a lot for your suggestion. It's a good idea to offer the flink-ml
libraries as optional dependencies on the download page, which would keep
the dist smaller.

But I also have some concerns about it, e.g., the download page now only
includes the latest 3 releases, so we may need to find ways to support more
versions.
On the other hand, the flink-ml libraries are currently very small
(about 246K), so they would not have much impact on the size of the dist.

What do you think?

Best,
Hequn

On Mon, Feb 3, 2020 at 6:24 PM Till Rohrmann <trohrm...@apache.org> wrote:

> An alternative solution would be to offer the flink-ml libraries as
> optional dependencies on the download page. Similar to how we offer the
> different SQL formats and Hadoop releases [1].
>
> [1] https://flink.apache.org/downloads.html
>
> Cheers,
> Till
>
> On Mon, Feb 3, 2020 at 10:19 AM Hequn Cheng <he...@apache.org> wrote:
>
> > Thank you all for your feedback and suggestions!
> >
> > Best, Hequn
> >
> > On Mon, Feb 3, 2020 at 5:07 PM Becket Qin <becket....@gmail.com> wrote:
> >
> > > Thanks for bringing up the discussion, Hequn.
> > >
> > > +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would make
> > > it much easier for users to try out some simple ML tasks.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Feb 3, 2020 at 4:34 PM jincheng sun <sunjincheng...@gmail.com>
> > > wrote:
> > >
> > >> Thank you for pushing this forward, @Hequn Cheng <he...@apache.org>!
> > >>
> > >> Hi @Becket Qin <becket....@gmail.com>, do you have any concerns about
> > >> this?
> > >>
> > >> Best,
> > >> Jincheng
> > >>
> > >> On Mon, Feb 3, 2020 at 2:09 PM Hequn Cheng <he...@apache.org> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> Thanks for the feedback. As there are no objections, I've opened a JIRA
> > >>> issue (FLINK-15847 [1]) to address this.
> > >>> The implementation details can be discussed in the issue or in the
> > >>> following PR.
> > >>>
> > >>> Best,
> > >>> Hequn
> > >>>
> > >>> [1] https://issues.apache.org/jira/browse/FLINK-15847
> > >>>
> > >>> On Wed, Jan 8, 2020 at 9:15 PM Hequn Cheng <chenghe...@gmail.com> wrote:
> > >>>
> > >>> > Hi Jincheng,
> > >>> >
> > >>> > Thanks a lot for your feedback!
> > >>> > Yes, I agree with you. There are cases where multiple jars need to be
> > >>> > uploaded. I will start another discussion later, maybe with a simple
> > >>> > design doc.
> > >>> >
> > >>> > Best, Hequn
> > >>> >
> > >>> > On Wed, Jan 8, 2020 at 3:06 PM jincheng sun <sunjincheng...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> >> Thanks for bringing up this discussion, Hequn!
> > >>> >>
> > >>> >> +1 for including `flink-ml-api` and `flink-ml-lib` in opt.
> > >>> >>
> > >>> >> BTW: I think it would be great to start a discussion about uploading
> > >>> >> multiple jars at the same time, as PyFlink jobs would also benefit
> > >>> >> from that improvement.
> > >>> >>
> > >>> >> Best,
> > >>> >> Jincheng
> > >>> >>
> > >>> >>
> > >>> >> On Wed, Jan 8, 2020 at 11:50 AM Hequn Cheng <chenghe...@gmail.com> wrote:
> > >>> >>
> > >>> >> > Hi everyone,
> > >>> >> >
> > >>> >> > FLIP-39 [1] rebuilds the Flink ML pipeline on top of the Table API,
> > >>> >> > which moves Flink ML a step further. Based on it, users can develop
> > >>> >> > their ML jobs, and more and more machine learning platforms are
> > >>> >> > providing ML services.
> > >>> >> >
> > >>> >> > However, the problem now is that the jars of flink-ml-api and
> > >>> >> > flink-ml-lib only exist in the Maven repo. Whenever users want to
> > >>> >> > submit ML jobs, they have to depend on the ML modules and package a
> > >>> >> > fat jar. This is inconvenient, especially for machine learning
> > >>> >> > platforms, on which nearly all jobs depend on the Flink ML modules
> > >>> >> > and have to package a fat jar.
> > >>> >> >
> > >>> >> > Given this, it would be better to include the jars of flink-ml-api
> > >>> >> > and flink-ml-lib in the `opt` folder, so that users can directly use
> > >>> >> > the jars with the binary release. For example, users can move the
> > >>> >> > jars into the `lib` folder or use -j to upload the jars. (Currently,
> > >>> >> > -j only supports uploading one jar. Supporting multiple jars for -j
> > >>> >> > can be covered in a separate discussion.)
> > >>> >> >
> > >>> >> > The jars go in the `opt` folder instead of the `lib` folder because,
> > >>> >> > currently, the ML jars are still optional for the Flink project by
> > >>> >> > default.
> > >>> >> >
> > >>> >> > What do you think? Any feedback is welcome!
> > >>> >> >
> > >>> >> > Best,
> > >>> >> >
> > >>> >> > Hequn
> > >>> >> >
> > >>> >> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
> > >>> >> >
> > >>> >>
> > >>> >
> > >>>
> > >>
> >
>
