Re: [DISCUSS] Enhancing the functionality and productivity of Table API

Aljoscha Krettek Thu, 01 Nov 2018 05:12:40 -0700

Hi Jincheng,

These points sound very good! Are there any concrete proposals for changes, for 
example a FLIP or design document?


See here for FLIPs: 
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Best,
Aljoscha 

> On 1. Nov 2018, at 12:51, jincheng sun <sunjincheng...@gmail.com> wrote:
> 
> *I am sorry for the formatting of my previous email. The content is
> reformatted below.*
> 
> *Hi ALL,*
> 
> With continuous effort from the community, Flink has steadily improved and
> attracted more and more users. Flink SQL is a canonical, widely used
> relational query language. However, there are still scenarios where Flink
> SQL fails to meet user needs in terms of functionality and ease of use,
> such as:
> 
> *1. In terms of functionality*
>    Users cannot express iteration, user-defined windows, user-defined joins,
> user-defined GroupReduce, etc. in SQL.
> 
> *2. In terms of ease of use*
> 
>   - Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(),
>   udf2(), udf3()....)” can accomplish the same thing, a map() function that
>   returns 100 columns requires defining or calling 100 UDFs when using SQL,
>   which is quite involved.
>   - FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can be
>   implemented with “table.join(udtf).select()”. However, the DataStream
>   version is clearly easier to use than the SQL one.
> 
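
To make the Map point above concrete, here is a minimal plain-Java sketch. It
does not use any Flink API: the class MapVersusSelect and the methods mapFun,
columnUdfs and select are hypothetical stand-ins chosen only for illustration.
It contrasts one function that emits a whole row with one scalar function per
output column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class MapVersusSelect {

    // Hypothetical stand-in for mapFun: a single call produces every output column.
    public static List<Integer> mapFun(int input, int width) {
        List<Integer> row = new ArrayList<>();
        for (int i = 1; i <= width; i++) {
            row.add(input * i);
        }
        return row;
    }

    // Hypothetical stand-in for udf1(), udf2(), ...: one scalar function per column.
    // They are generated in a loop here only to keep the sketch short; in practice
    // each UDF would be written and registered by hand.
    public static List<Function<Integer, Integer>> columnUdfs(int width) {
        List<Function<Integer, Integer>> udfs = new ArrayList<>();
        for (int i = 1; i <= width; i++) {
            final int factor = i;
            udfs.add(value -> value * factor);
        }
        return udfs;
    }

    // Stand-in for select(udf1(), udf2(), ...): evaluates each column function in turn.
    public static List<Integer> select(int input, List<Function<Integer, Integer>> udfs) {
        List<Integer> row = new ArrayList<>();
        for (Function<Integer, Integer> udf : udfs) {
            row.add(udf.apply(input));
        }
        return row;
    }

    public static void main(String[] args) {
        int width = 100;
        List<Integer> viaMap = mapFun(3, width);
        List<Integer> viaSelect = select(3, columnUdfs(width));
        // Both rows hold the same values, but the second path needed `width` functions.
        System.out.println("rows equal: " + viaMap.equals(viaSelect));
        System.out.println("functions needed by select: " + columnUdfs(width).size());
    }
}
```

Both paths yield the same row; the difference is that the select-style path
needs as many functions as there are columns, which is the usability gap
described above.
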
> For these two reasons, some users have to use the DataStream API or the
> DataSet API. But when they do, they lose the unification of batch and
> streaming. They also lose the sophisticated optimizations of Flink SQL,
> such as codegen, aggregate-join transpose, and multi-stage aggregation.
> 
> We believe that enhancing functionality and productivity is vital for
> the successful adoption of the Table API. To this end, the Table API still
> requires more effort from every contributor in the community. We see a great
> opportunity to improve our users’ experience through this work. Any feedback
> is welcome.
> 
> Regards,
> 
> Jincheng
> 
> jincheng sun <sunjincheng...@gmail.com> wrote on Thu, Nov 1, 2018 at 5:07 PM:
> 
>> Hi all,
>> 
>> With continuous effort from the community, Flink has steadily improved and
>> attracted more and more users. Flink SQL is a canonical, widely used
>> relational query language. However, there are still scenarios where Flink
>> SQL fails to meet user needs in terms of functionality and ease of use,
>> such as:
>> 
>>   - In terms of functionality
>>
>>     Users cannot express iteration, user-defined windows, user-defined
>>     joins, user-defined GroupReduce, etc. in SQL.
>>
>>   - In terms of ease of use
>>
>>     - Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(),
>>       udf2(), udf3()....)” can accomplish the same thing, a map() function
>>       that returns 100 columns requires defining or calling 100 UDFs when
>>       using SQL, which is quite involved.
>>     - FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can
>>       be implemented with “table.join(udtf).select()”. However, the
>>       DataStream version is clearly easier to use than the SQL one.
>> 
>> For these two reasons, some users have to use the DataStream API or the
>> DataSet API. But when they do, they lose the unification of batch and
>> streaming. They also lose the sophisticated optimizations of Flink SQL,
>> such as codegen, aggregate-join transpose, and multi-stage aggregation.
>> 
>> We believe that enhancing functionality and productivity is vital for
>> the successful adoption of the Table API. To this end, the Table API still
>> requires more effort from every contributor in the community. We see a great
>> opportunity to improve our users’ experience through this work. Any feedback
>> is welcome.
>> 
>> Regards,
>> 
>> Jincheng
>> 
>>
