Hi Aljoscha,

Glad that you like the proposal. We have completed a prototype of most of the newly proposed functionalities. Once we have collected feedback from the community, we will come up with a concrete FLIP/design doc.
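To make this concrete, here is a rough sketch of how the prototyped map/flatMap could look on Table, written against the Flink 1.6-era Scala Table API. Everything here is illustrative only and will be pinned down in the FLIP: the functions ParseLine and SplitWords, the input table, and its columns 'line/'text are made up, and table.map()/table.flatMap() do not exist in the released API yet. For comparison, the current workarounds are sketched below the quoted thread.

import org.apache.flink.table.api.{Table, Types}
import org.apache.flink.table.api.scala._
import org.apache.flink.table.functions.{ScalarFunction, TableFunction}
import org.apache.flink.types.Row

// A ScalarFunction returning a composite Row, so that one map() call can
// emit many columns without defining one UDF per column.
class ParseLine extends ScalarFunction {
  def eval(line: String): Row = {
    val f = line.split(",")
    Row.of(f(0), f(1))
  }
  override def getResultType(signature: Array[Class[_]]) =
    Types.ROW(Types.STRING, Types.STRING)
}

// A TableFunction emitting zero or more records per input row.
class SplitWords extends TableFunction[String] {
  def eval(text: String): Unit = text.split("\\s+").foreach(collect)
}

val table: Table = ???  // some input table with String columns 'line and 'text
val parse = new ParseLine
val split = new SplitWords

// Proposed operations (names and signatures not final):
val mapped    = table.map(parse('line))     // one call instead of select(udf1(), ..., udf100())
val flattened = table.flatMap(split('text)) // instead of table.join(udtf).select()

The key point is that a single user-defined function drives the whole operation, instead of one scalar UDF per output column.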
Regards,
Shaoxuan

On Thu, Nov 1, 2018 at 8:12 PM Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi Jincheng,
>
> these points sound very good! Are there any concrete proposals for
> changes? For example a FLIP/design document?
>
> See here for FLIPs:
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>
> Best,
> Aljoscha
>
> > On 1. Nov 2018, at 12:51, jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> > *-------- I am sorry for the formatting of the email content. I have
> > reformatted the content as follows --------*
> >
> > *Hi all,*
> >
> > With the continuous efforts of the community, the Flink system has been
> > steadily improved, which has attracted more and more users. Flink SQL
> > is a canonical, widely used relational query language. However, there
> > are still some scenarios where Flink SQL fails to meet user needs in
> > terms of functionality and ease of use, such as:
> >
> > *1. In terms of functionality*
> > Iteration, user-defined window, user-defined join, user-defined
> > GroupReduce, etc. Users cannot express them with SQL;
> >
> > *2. In terms of ease of use*
> >
> >   - Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(),
> >   udf2(), udf3()....)” can be used to accomplish the same function, with
> >   a map() function returning 100 columns one has to define or call 100
> >   UDFs when using SQL, which is quite involved.
> >   - FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can
> >   be implemented with “table.join(udtf).select()”. However, it is
> >   obvious that the DataStream API is easier to use than SQL here.
> >
> > Due to the above two reasons, some users have to use the DataStream API
> > or the DataSet API. But when they do that, they lose the unification of
> > batch and streaming. They also lose the sophisticated optimizations
> > that Flink SQL provides, such as code generation, aggregate-join
> > transpose, and multi-stage aggregation.
> >
> > We believe that enhancing functionality and productivity is vital for
> > the successful adoption of the Table API. To this end, the Table API
> > still requires more effort from every contributor in the community. We
> > see a great opportunity to improve our users’ experience through this
> > work. Any feedback is welcome.
> >
> > Regards,
> >
> > Jincheng
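For reference, the two workarounds described in the quoted message look roughly like this with today's Scala Table API. The functions ExtractField and Split and all table/column names are made up for illustration; the select(), join(udtf), and DataStream flatMap() calls themselves are the existing Flink 1.6 APIs.

import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.Table
import org.apache.flink.table.api.scala._
import org.apache.flink.table.functions.{ScalarFunction, TableFunction}

// One ScalarFunction per output column: a map() that returns N columns
// becomes select(udf1(), ..., udfN()), so 100 columns means 100 calls.
class ExtractField(index: Int) extends ScalarFunction {
  def eval(line: String): String = line.split(",")(index)
}

// A TableFunction used to emulate flatMap via table.join(udtf).select().
class Split extends TableFunction[String] {
  def eval(text: String): Unit = text.split("\\s+").foreach(collect)
}

val table: Table = ???  // some input table with String columns 'line and 'text

val f0 = new ExtractField(0)
val f1 = new ExtractField(1)
val split = new Split

// select()-based "map": one UDF invocation per output column.
val projected = table.select(f0('line) as 'a, f1('line) as 'b /* ..., one per column */)

// join(udtf)-based "flatMap":
val words = table.join(split('text) as 'word).select('word)

// The same logic on DataStream is a one-liner:
val env = StreamExecutionEnvironment.getExecutionEnvironment
val lines: DataStream[String] = env.fromElements("to be or not to be")
val flat: DataStream[String] = lines.flatMap(_.split("\\s+"))

This is the gap the proposal aims to close: the DataStream version expresses the intent directly, while the Table/SQL version needs auxiliary function classes and boilerplate per column.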