Hi Till,

Thanks a lot for your suggestion. It's a good idea to offer the flink-ml libraries as optional dependencies on the download page, which would make the dist smaller.
But I also have some concerns about it, e.g., the download page now only includes the latest 3 releases, so we may need to find ways to support more versions. On the other hand, the flink-ml libraries are currently very small (about 246K), so they would not have much impact on the size of the dist.

What do you think?

Best,
Hequn

On Mon, Feb 3, 2020 at 6:24 PM Till Rohrmann <trohrm...@apache.org> wrote:

> An alternative solution would be to offer the flink-ml libraries as
> optional dependencies on the download page, similar to how we offer the
> different SQL formats and Hadoop releases [1].
>
> [1] https://flink.apache.org/downloads.html
>
> Cheers,
> Till
>
> On Mon, Feb 3, 2020 at 10:19 AM Hequn Cheng <he...@apache.org> wrote:
>
> > Thank you all for your feedback and suggestions!
> >
> > Best, Hequn
> >
> > On Mon, Feb 3, 2020 at 5:07 PM Becket Qin <becket....@gmail.com> wrote:
> >
> > > Thanks for bringing up the discussion, Hequn.
> > >
> > > +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would
> > > make it much easier for users to try out some simple ml tasks.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Feb 3, 2020 at 4:34 PM jincheng sun <sunjincheng...@gmail.com>
> > > wrote:
> > >
> > >> Thank you for pushing this forward, @Hequn Cheng <he...@apache.org>!
> > >>
> > >> Hi @Becket Qin <becket....@gmail.com>, do you have any concerns about
> > >> this?
> > >>
> > >> Best,
> > >> Jincheng
> > >>
> > >> On Mon, Feb 3, 2020 at 2:09 PM Hequn Cheng <he...@apache.org> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> Thanks for the feedback. As there are no objections, I've opened a
> > >>> JIRA issue (FLINK-15847 [1]) to address this.
> > >>> The implementation details can be discussed in the issue or in the
> > >>> following PR.
> > >>>
> > >>> Best,
> > >>> Hequn
> > >>>
> > >>> [1] https://issues.apache.org/jira/browse/FLINK-15847
> > >>>
> > >>> On Wed, Jan 8, 2020 at 9:15 PM Hequn Cheng <chenghe...@gmail.com>
> > >>> wrote:
> > >>>
> > >>> > Hi Jincheng,
> > >>> >
> > >>> > Thanks a lot for your feedback!
> > >>> > Yes, I agree with you. There are cases where multiple jars need
> > >>> > to be uploaded. I will prepare another discussion later, maybe
> > >>> > with a simple design doc.
> > >>> >
> > >>> > Best, Hequn
> > >>> >
> > >>> > On Wed, Jan 8, 2020 at 3:06 PM jincheng sun
> > >>> > <sunjincheng...@gmail.com> wrote:
> > >>> >
> > >>> >> Thanks for bringing up this discussion, Hequn!
> > >>> >>
> > >>> >> +1 for including `flink-ml-api` and `flink-ml-lib` in opt.
> > >>> >>
> > >>> >> BTW: I think it would be great to bring up a discussion about
> > >>> >> uploading multiple jars at the same time, as PyFlink jobs could
> > >>> >> also benefit from that improvement.
> > >>> >>
> > >>> >> Best,
> > >>> >> Jincheng
> > >>> >>
> > >>> >> On Wed, Jan 8, 2020 at 11:50 AM Hequn Cheng
> > >>> >> <chenghe...@gmail.com> wrote:
> > >>> >>
> > >>> >> > Hi everyone,
> > >>> >> >
> > >>> >> > FLIP-39 [1] rebuilds the Flink ML pipeline on top of the Table
> > >>> >> > API, which moves Flink ML a step further. Based on it, users
> > >>> >> > can develop their ML jobs, and more and more machine learning
> > >>> >> > platforms are providing ML services.
> > >>> >> >
> > >>> >> > However, the problem now is that the jars of flink-ml-api and
> > >>> >> > flink-ml-lib exist only in the Maven repo. Whenever users want
> > >>> >> > to submit ML jobs, they have to depend on the ml modules and
> > >>> >> > package a fat jar. This is inconvenient, especially for
> > >>> >> > machine learning platforms on which nearly all jobs depend on
> > >>> >> > the Flink ML modules and have to package a fat jar.
> > >>> >> >
> > >>> >> > Given this, it would be better to include the jars of
> > >>> >> > flink-ml-api and flink-ml-lib in the `opt` folder, so that
> > >>> >> > users can directly use the jars with the binary release. For
> > >>> >> > example, users can move the jars into the `lib` folder or use
> > >>> >> > -j to upload the jars. (Currently, -j only supports uploading
> > >>> >> > one jar. Supporting multiple jars for -j can be discussed in
> > >>> >> > another discussion.)
> > >>> >> >
> > >>> >> > The jars would go in the `opt` folder instead of the `lib`
> > >>> >> > folder because, currently, the ml jars are still optional for
> > >>> >> > the Flink project by default.
> > >>> >> >
> > >>> >> > What do you think? Any feedback is welcome!
> > >>> >> >
> > >>> >> > Best,
> > >>> >> >
> > >>> >> > Hequn
> > >>> >> >
> > >>> >> > [1]
> > >>> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs