Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-22 Thread jincheng sun
Hi Fabian, Yes, timers are not only a difference between Table and DataStream, but also a difference between DataStream and DataSet. We need to unify batch and stream in Table, so the difference regarding timers needs to be considered in depth. :) Thanks, Jincheng Fabian Hueske on Nov 15, 2018

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-15 Thread Fabian Hueske
Thanks Jincheng, That makes sense to me. Another differentiation of Table API and DataStream API would be the access to the timer service. The DataStream API can register and act on timers while the Table API would not have this feature. Best, Fabian On Wed, Nov 14, 2018 at 02:02, ji
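To make "register and act on timers" concrete, here is a minimal toy sketch in plain Python. It is not Flink's API: the class `ToyTimerService` and its methods `register_timer` and `advance_to` are hypothetical names invented for illustration, and the behaviour (timers are de-duplicated per key and timestamp, and fire in timestamp order once time passes them) is an assumption about what a timer service typically offers.

```python
import heapq


class ToyTimerService:
    """Hypothetical stand-in for a timer service; not Flink's actual API."""

    def __init__(self):
        self._heap = []           # min-heap of (timestamp, key)
        self._registered = set()  # (key, timestamp) pairs currently pending

    def register_timer(self, key, timestamp):
        # Registering the same (key, timestamp) twice keeps a single timer.
        if (key, timestamp) not in self._registered:
            self._registered.add((key, timestamp))
            heapq.heappush(self._heap, (timestamp, key))

    def advance_to(self, current_time, on_timer):
        # Fire every pending timer whose timestamp is <= current_time,
        # in ascending timestamp order.
        while self._heap and self._heap[0][0] <= current_time:
            timestamp, key = heapq.heappop(self._heap)
            self._registered.discard((key, timestamp))
            on_timer(key, timestamp)


# Usage: user code registers timers and reacts when they fire.
service = ToyTimerService()
service.register_timer("user-1", 10)
service.register_timer("user-2", 5)
service.advance_to(7, lambda key, ts: print(f"timer fired for {key} at {ts}"))
```

The point of the sketch is that user code decides when a callback should run, which is the kind of imperative control the message contrasts with a purely relational API.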

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-13 Thread jincheng sun
Hi Piotrek, Fabian: I am very glad to see your replies. Thank you very much, Piotrek, for asking very good questions. I will share my opinion: - The TableAPI enhancement that I proposed is aimed at user friendliness. After the enhancement, it will maintain the characteristics of TableAPI & SQL,

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-13 Thread Fabian Hueske
Yes, that is my understanding as well. Manual time management would be another difference. Something still to be discussed would be whether (or to what extent) it would be possible to define the physical execution plan with hints or methods like partitionByHash and sortPartition. Best, Fabian On
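For readers unfamiliar with the two methods named above, the following plain-Python sketch illustrates the general idea of hash partitioning followed by a per-partition sort. It does not use Flink; the helper names `partition_by_hash` and `sort_partition` are illustrative stand-ins, and the use of `zlib.crc32` as a stable hash is an assumption made so that results do not vary between runs.

```python
import zlib


def partition_by_hash(records, key, num_partitions):
    """Route each record to a partition chosen by a stable hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        digest = zlib.crc32(repr(key(record)).encode("utf-8"))
        partitions[digest % num_partitions].append(record)
    return partitions


def sort_partition(partitions, key):
    """Sort every partition locally; no ordering is implied across partitions."""
    return [sorted(partition, key=key) for partition in partitions]


# Usage: records are (word, count) pairs, partitioned by word, sorted by count.
records = [("a", 3), ("b", 1), ("a", 1), ("c", 2), ("b", 0)]
parts = partition_by_hash(records, key=lambda r: r[0], num_partitions=3)
for index, part in enumerate(sort_partition(parts, key=lambda r: r[1])):
    print(index, part)
```

Both steps fix how the data is physically laid out rather than what result is computed, which is why the message treats them as hints about the physical execution plan.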

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-13 Thread Piotr Nowojski
Hi, > This thread is meant to enhance the functionalities of TableAPI. I don't > think that anyone is suggesting either reducing the effort in SQL or > DataStream. So let's focus on how we can enhance TableAPI. I wasn’t thinking about that. As I said before, I was raising a question: what Table

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-07 Thread Xiaowei Jiang
Hi Piotr: I want to clarify one thing first: I think that we will keep the interoperability between TableAPI and DataStream in any case, so users can switch between the two whenever needed. Given that, it would still be very helpful if users could use one API to achieve most of what they do. Curren

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread Shaoxuan Wang
Hi all, Thanks for the feedback. I enjoyed the discussions, especially the ones between Fabian and Xiaowei. I think they clearly revealed the motivations and design pros/cons behind this proposal. Enhancing TableAPI will not affect or limit the improvements to Flink SQL (or to DataStream). Actual

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread SHI Xiaogang
Hi all, Thank you for your replies and comments. I have concerns similar to Piotrek's. My opinion is that two APIs are enough for Flink: a declarative one (SQL) and an imperative one (DataStream). From my perspective, most users prefer SQL most of the time and turn to DataStream when the l

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread Piotr Nowojski
Hi, What is our intended division/border between Table API and DataSet or DataStream? If we want Table API to drift away from SQL that would be a valid question. > Another distinguishing feature of DataStream API is that users get direct > access to state/statebackend which we intentionally avo

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread Fabian Hueske
Hi, An analysis of orthogonal functions would be great! There is certainly some overlap in the functions provided by the DataSet API. In the past, I found that having low-level functions helped a lot to efficiently implement complex logic. Without partitionByHash, sortPartition, sort, mapPartitio

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread jincheng sun
Hi Fabian, Thank you for your deep thoughts in this regard. I think most of the questions you mentioned are well worth in-depth discussion! I want to share my thoughts on the following questions: 1. Do we need to move all DataSet API functionality into the Table API? I think most of dataset functionalit

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread Xiaowei Jiang
Hi Fabian, I totally agree with you that we should incrementally improve TableAPI. We don't suggest that we do anything drastic such as replacing DataSet API yet. We should see how much we can achieve by extending TableAPI cleanly. By then, we should see if there are any natural boundaries on how

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread Fabian Hueske
Thanks for the replies, Xiaowei and others! You are right, I did not consider the batch optimization that would be missing if the DataSet API were ported to extend the DataStream API. By extending the scope of the Table API, we can gain a holistic logical & physical optimization which would be

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-06 Thread jincheng sun
Hi Jark, Glad to see your feedback! That's correct, the proposal aims to extend the functionality of the Table API! I would like to add "drop" to fit the use case you mentioned. Not only that: if a Table has 100 columns and our UDF needs all 100 of them, we don't want to define the eval as eval(column0...c

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread Jark Wu
Hi Jincheng, Thanks for your proposal. I think it is a helpful enhancement and a solid step forward for TableAPI. It doesn't weaken SQL or DataStream, because the conversion between DataStream and Table still works. People with advanced cases (e.g. complex and fine-grained state

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread jincheng sun
Hi Rong Rong, Sorry for the late reply, and thanks for your feedback! We will continue to add more convenience features to the TableAPI, such as map, flatmap, agg, flatagg, iteration etc. I am very happy that you are interested in this proposal. Since this is long-term, continuous work, we

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread Becket Qin
Hi Xiaogang, Thanks for the comments. Please see the responses inline below. On Tue, Nov 6, 2018 at 11:28 AM SHI Xiaogang wrote: > Hi all, > > I think it's good to enhance the functionality and productivity of Table > API, but still I think SQL + DataStream is a better choice from user > experi

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread jincheng sun
Hi Xiaogang, Thanks for your feedback. I will share my thoughts here: First, enhancing TableAPI does not mean weakening SQL. We also need to enhance the functionality of SQL, such as @Xuefu's ongoing integration of the Hive SQL ecosystem. In addition, SQL and TableAPI are two different API forms

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread SHI Xiaogang
Hi all, I think it's good to enhance the functionality and productivity of Table API, but I still think SQL + DataStream is a better choice from a user-experience perspective. 1. The unification of batch and stream processing is very attractive, and many of our users are moving their batch-processing applications t

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread Xiaowei Jiang
Hi Fabian, these are great questions! I have some quick thoughts on some of these. Optimization opportunities: I think that you are right that UDFs are more like black boxes today. However, this can change if we let users develop UDFs symbolically in the future (i.e., Flink will look inside the UDF code,

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-05 Thread Fabian Hueske
Hi Jincheng, Thanks for this interesting proposal. I like that we can push this effort forward in a very fine-grained manner, i.e., incrementally adding more APIs to the Table API. However, I also have a few questions / concerns. Today, the Table API is tightly integrated with the DataSet and Dat

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-04 Thread Rong Rong
Hi Jincheng, Thank you for the proposal! I think being able to define a process / co-process function in the Table API definitely opens up a whole new level of applications using a unified API. In addition, as Tzu-Li and Hequn have mentioned, the benefit of the optimization layer of the Table API will alread

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-04 Thread jincheng sun
Hi tison, Thanks a lot for your feedback! I am very happy to see that community contributors agree to enhance the TableAPI. This is long-term, continuous work that we will push forward in stages. We will soon complete the list of enhancements for the first phase, and we can discuss it in depth in a Google doc. t

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Tzu-Li Chen
Hi Jincheng, Thanks a lot for your proposal! I find it a good starting point for internal optimization work and for helping Flink become more user-friendly. AFAIK, DataStream is currently the most popular API, and with it Flink users have to describe their logic in detail. From a more internal view t

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
Hi Hequn, Thanks for your feedback! And also thanks for our offline discussion! You are right, unification of batch and streaming is very important for the Flink API. We will provide a more detailed design later. Please let me know if you have further thoughts or feedback. Thanks, Jincheng Hequn Cheng

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
Hi, Jiangjie, Thanks a lot for your feedback. And also thanks for our offline discussion! Yes, you're right! The Row-based APIs which you mentioned are very friendly to Flink users! In order to follow the concept of the traditional database, perhaps we could name the corresponding functions RowValued/TabeVa

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Hequn Cheng
Hi Jincheng, Thanks a lot for your proposal. It is very encouraging! As we all know, SQL is a widely used language. It follows standards, is a descriptive language, and is easy to use. A powerful feature of SQL is that it supports optimization. Users only need to care about the logic of the progr

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Shaoxuan Wang
Hi Aljoscha, Glad that you like the proposal. We have completed the prototype of most of the newly proposed functionalities. Once we collect feedback from the community, we will come up with a concrete FLIP/design doc. Regards, Shaoxuan On Thu, Nov 1, 2018 at 8:12 PM Aljoscha Krettek wrote: > Hi Jincheng,

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Becket Qin
Thanks for the proposal, Jincheng. This makes a lot of sense. As a programming interface, Table API is especially attractive because it supports both batch and stream. However, the relational-only API often forces users to shoehorn their logic into a bunch of user defined functions. Introducing so

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
Hi, Timo, I am very grateful for your feedback, and I am very excited to hear that you have also considered adding a process function to the TableAPI. I agree that we should add support for the process function to the Table API, which is actually part of my proposal Enhancing the functionality of Table API. In

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Aljoscha Krettek
Yes, that makes sense! > On 1. Nov 2018, at 15:51, jincheng sun wrote: > > Hi, Aljoscha, > > Thanks for your feedback and suggestions. I think you are right, the > detailed design/FLIP is very necessary. Before writing the detailed design or opening > a FLIP, I would like to hear the community's views on

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
Hi, Aljoscha, Thanks for your feedback and suggestions. I think you are right, the detailed design/FLIP is very necessary. Before writing the detailed design or opening a FLIP, I would like to hear the community's views on Enhancing the functionality and productivity of Table API, to ensure that it is worth t

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Timo Walther
Hi Jincheng, I have also thought several times about introducing a process function for the Table API. This would allow defining more complex logic (custom windows, timers, etc.) embedded into a relational API with schema awareness and optimization around the black box. Of course this would

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread Aljoscha Krettek
Hi Jincheng, these points sound very good! Are there any concrete proposals for changes? For example a FLIP/design document? See here for FLIPs: https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals Best, Aljoscha > On 1. Nov 2018, at 12:51, jincheng sun wrote: > >

Re: [DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
I am sorry for the formatting of the email content. I have reformatted the content as follows: Hi all, With the continuous efforts from the community, the Flink system has been continuously improved, which has attracted more and more users. Flink SQL is a canonical, widely used

[DISCUSS] Enhancing the functionality and productivity of Table API

2018-11-01 Thread jincheng sun
Hi all, With the continuous efforts from the community, the Flink system has been continuously improved, which has attracted more and more users. Flink SQL is a canonical, widely used relational query language. However, there are still some scenarios where Flink SQL fails to meet user needs in te