Hi all,

In the review of the PR for FLINK-12473, there were a few comments regarding pipeline exportation. We would like to start a follow-up discussion to address some related comments.
Currently, the FLIP-39 proposal gives users a way to persist a pipeline in JSON format, but it does not specify how users can export a pipeline for serving purposes. We have summarized some thoughts on this in the following doc:

https://docs.google.com/document/d/1B84b-1CvOXtwWQ6_tQyiaHwnSeiRqh-V96Or8uHqCp8/edit?usp=sharing

After we reach consensus on pipeline exportation, we will add a corresponding section to FLIP-39.

Shaoxuan Wang <wshaox...@gmail.com> wrote on Wed, Jun 5, 2019 at 8:47 AM:

> Stavros,
> They share a similar logical concept, but the implementation details are
> quite different. It is hard to migrate the interface with different
> implementations. The built-in algorithms are a useful legacy that we will
> consider migrating to the new API (but still with different implementations).
> BTW, the new API has already been merged via FLINK-12473.
>
> Thanks,
> Shaoxuan
>
> On Mon, Jun 3, 2019 at 6:08 PM Stavros Kontopoulos <st.kontopou...@gmail.com> wrote:
>
> > Hi,
> >
> > Some portion of the code could be migrated to the new Table API, no?
> > I am saying that because the new API design is based on scikit-learn and
> > the old one was also inspired by it.
> >
> > Best,
> > Stavros
> >
> > On Wed, May 22, 2019 at 1:24 PM Shaoxuan Wang <wshaox...@gmail.com> wrote:
> >
> > > Another consensus (from the offline discussion) is that we will
> > > delete/deprecate flink-libraries/flink-ml. I have started a survey and
> > > discussion [1] in dev/user-ml to collect the feedback. Depending on the
> > > replies, we will decide whether to delete it in Flink 1.9 or to
> > > deprecate it and then delete it in the next release after 1.9.
> > >
> > > [1]
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Usage-of-flink-ml-and-DISCUSS-Delete-flink-ml-td29057.html
> > >
> > > Regards,
> > > Shaoxuan
> > >
> > > On Tue, May 21, 2019 at 9:22 PM Gen Luo <luogen...@gmail.com> wrote:
> > >
> > > > Yes, this is our conclusion.
> > > > I'd like to add only one point: registering a user-defined
> > > > aggregator is also needed, which is currently provided by 'bridge'
> > > > and will eventually be merged into the Table API. The same goes for
> > > > collect().
> > > >
> > > > I will add a TableEnvironment argument to Estimator.fit() and
> > > > Transformer.transform() to get rid of the dependency on
> > > > flink-table-planner. This will be committed soon.
> > > >
> > > > Aljoscha Krettek <aljos...@apache.org> wrote on Tue, May 21, 2019 at 7:31 PM:
> > > >
> > > > > We discussed this in private and came to the conclusion that we should
> > > > > (for now) have the dependency on flink-table-api-xxx-bridge because we
> > > > > need access to the collect() method, which is not yet available in the
> > > > > Table API. Once that is available the code can be refactored, but for
> > > > > now we want to unblock work on this new module.
> > > > >
> > > > > We also agreed that we don’t need a direct dependency on
> > > > > flink-table-planner.
> > > > >
> > > > > I hope I summarised our discussion correctly.
> > > > >
> > > > > > On 17. May 2019, at 12:20, Gen Luo <luogen...@gmail.com> wrote:
> > > > > >
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > For the first question, it's not strictly necessary. But I prefer not
> > > > > > to have a TableEnvironment argument in Estimator.fit() or
> > > > > > Transformer.transform(), which is not part of the machine learning
> > > > > > concept and may make our API less clean and pretty than those of
> > > > > > other systems. I would like a way to do this other than introducing
> > > > > > flink-table-planner. If that is impossible or strongly opposed, I may
> > > > > > make the concession and add the argument.
> > > > > >
> > > > > > Other than that, the "flink-table-api-xxx-bridge" modules are still needed.
> > > > > > A very common case is that an algorithm needs to guarantee that it's
> > > > > > running under a BatchTableEnvironment, which makes it possible to
> > > > > > collect the result of each iteration. A typical algorithm like this is
> > > > > > ALS. As of Flink 1.8, this can only be achieved by converting the Table
> > > > > > to a DataSet and then calling DataSet.collect(), which is available in
> > > > > > flink-table-api-xxx-bridge. Besides, registering a UDAGG also depends
> > > > > > on it.
> > > > > >
> > > > > > In conclusion, "planner" can be removed from the dependencies, but
> > > > > > introducing the "bridge" modules is inevitable. Whether and how to
> > > > > > acquire a TableEnvironment from a Table can be discussed.