Hi Jincheng,

these points sound very good! Are there any concrete proposals for changes? For example, a FLIP or design document?
See here for FLIPs: https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Best,
Aljoscha

> On 1. Nov 2018, at 12:51, jincheng sun <sunjincheng...@gmail.com> wrote:
>
> *I am sorry for the formatting of the email content. I have reformatted
> the content as follows:*
>
> *Hi ALL,*
>
> With the continuous efforts from the community, the Flink system has been
> continuously improved, which has attracted more and more users. Flink SQL
> is a canonical, widely used relational query language. However, there are
> still some scenarios where Flink SQL fails to meet user needs in terms of
> functionality and ease of use, such as:
>
> *1. In terms of functionality*
> Iteration, user-defined window, user-defined join, user-defined
> GroupReduce, etc. Users cannot express these with SQL.
>
> *2. In terms of ease of use*
>
> - Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(),
>   udf2(), udf3()....)” can be used to accomplish the same result, with a
>   map() function returning 100 columns, one has to define or call 100 UDFs
>   when using SQL, which is quite cumbersome.
> - FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can be
>   implemented with “table.join(udtf).select()”. However, the DataStream
>   API is clearly easier to use here than SQL.
>
> For these two reasons, some users have to use the DataStream API or
> the DataSet API. But when they do that, they lose the unification of batch
> and streaming. They also lose the sophisticated optimizations from Flink
> SQL, such as codegen, aggregate join transpose, and multi-stage
> aggregation.
>
> We believe that enhancing functionality and productivity is vital for
> the successful adoption of the Table API. To this end, the Table API still
> requires more effort from every contributor in the community. We see a
> great opportunity to improve our users' experience through this work. Any
> feedback is welcome.
>
> Regards,
>
> Jincheng