Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-07 Thread Haibo Sun
Hi All, Thanks, Aljoscha, for further splitting up the topics. I will start separate threads on each topic that you proposed. Best, Haibo Aljoscha Krettek-2 wrote > Hi All, > > this is a great discussion! (I have some thoughts on most of the topics > but I'll wait for the separate discussion threads

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-07 Thread Aljoscha Krettek
Hi All, this is a great discussion! (I have some thoughts on most of the topics but I'll wait for the separate discussion threads) @Haibo Will you start separate threads? I think the separate discussion topics would be (based on Stephan's mail but further split up): 1. What should the API sta

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-07 Thread Shuai Xu
Hi all, Glad to see this discussion. We are currently designing enhancements to the scheduling of batch jobs, and a unified API will help a lot. Haibo Sun wrote on Wed, Dec 5, 2018 at 4:45 PM: > Hi all, > > Thank Kurt, you see more benefits of the unification than I do. > > I quite agree Kurt's views. DataStream, DataSet and T

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-05 Thread Haibo Sun
Hi all, Thanks, Kurt; you see more benefits of the unification than I do. I quite agree with Kurt's views. DataStream, DataSet and Table remain independent for now, and DataSet will be subsumed into DataStream in the future. The collection execution mode is replaced by the mini cluster. The high-level semantic

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-04 Thread Guowei Ma
Hi all, Thanks to Haibo for initiating this discussion in the community. - Relationship of DataStream, DataSet, and Table API Table/DataStream/DataSet do have different aspects. For example, DataStream can access State and Table cannot. DataStream can be easily extended by users because they d

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-04 Thread Kurt Young
Hi all, Really excited to see this discussion happening; I also want to share my two cents here. Let's first focus on this question: “What Flink API Stack Should be for a Unified Engine”. There are multiple ways to judge whether an engine is unified or not. From the user's perspective, as long as

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-03 Thread Wang Feng
Hi Stephan: I totally agree with you. This discussion covers too many topics, so we can cut it into the series of sub-discussions you proposed. First, we can focus on phase 1: “What Flink API Stack Should be for a Unified Engine”. Best, Feng Wang On Dec 3, 2018, at 19:36, Stephan Ewen mailt

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-03 Thread Stephan Ewen
Hi all! This is a great discussion to start and I agree with the idea behind it. We should get started designing what the Flink stack should look like in the future. This discussion is very big, though, and from past experience, if the scope is too big, the discussions end up falling apart when e

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-03 Thread Haibo Sun
Thanks, zhijiang. For optimizations such as cost-based estimation, we still want to keep them in the DataSet layer, but your suggestion is also worth considering. As far as I know, these batch scenarios are currently covered by DataSet, such as the sort-merge join algorit

Re: [DISCUSS] Unified Core API for Streaming and Batch

2018-12-03 Thread jincheng sun
Hi Haibo, Thank you for this great proposal! Flink is a unified computing engine. It has been unified at the Table API and SQL API levels (not yet complete). It would be great if we could unify the DataSet API and DataStream API. I also want to convert to StreamTransformation in the SQL and Table API, bec