Hi Benchao,
        Welcome to the discussion. Yes, this new syntax can make SQL clearer
        and simpler.
        Regarding your first question: the `window_start` and `window_end`
        columns are added automatically, so we don't need auxiliary group
        functions to infer or access the window properties.
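
        As a rough sketch of the difference (reusing the `Bid` table, `bidtime`
        and `price` from the snippet in Jark's original announcement; the
        `total_price` alias is only for illustration, and the exact output
        schema is whatever the FLIP finally defines):

```sql
-- Existing group window syntax: the window bounds have to be derived
-- with auxiliary functions that repeat the window definition.
SELECT
    TUMBLE_START(bidtime, INTERVAL '10' MINUTE) AS window_start,
    TUMBLE_END(bidtime, INTERVAL '10' MINUTE)   AS window_end,
    SUM(price) AS total_price
FROM Bid
GROUP BY TUMBLE(bidtime, INTERVAL '10' MINUTE);

-- Proposed windowing TVF syntax: window_start and window_end are
-- ordinary columns of the relation returned by TUMBLE(...).
SELECT window_start, window_end, SUM(price) AS total_price
FROM TABLE(
    TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
GROUP BY window_start, window_end;
```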
        
        As for `grouping sets` on TVFs, I think it would be interesting to
        support it, since we already support `grouping sets` on streaming
        aggregates in the Blink planner. But I'm not sure whether it will be
        included in this FLIP.
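
        Just to make the idea concrete, here is a sketch of what such a query
        might look like if `grouping sets` were allowed on top of a window TVF.
        This is an assumption on my side, not something the FLIP specifies, and
        `item` is a hypothetical column of `Bid`:

```sql
-- Hypothetical: GROUPING SETS applied to the windowed relation.
-- `item` is an assumed column; this syntax is not confirmed by the FLIP.
-- For each window it would produce one row per item plus one total row.
SELECT window_start, window_end, item, SUM(price) AS total_price
FROM TABLE(
    TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
GROUP BY window_start, window_end, GROUPING SETS ((item), ());
```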

        cc @Jark Wu

Best,
Pengcheng
        

On 2020/10/9 at 5:25 PM, "Benchao Li" <libenc...@apache.org> wrote:

    Thanks Jark for bringing up this discussion. I like this FLIP very much.

    I especially like the cumulate window. It's much like the current TUMBLE
    window plus Fast Emit (an undocumented experimental feature), but more
    powerful.

    This will also make the SQL semantics more standard, especially for the
    HOPPING window.

    Regarding the time attribute:
    It seems that we no longer need a specific function such as
    `TUMBLE_ROWTIME` / `TUMBLE_PROCTIME` to infer the time attribute. Are the
    `window_start` and `window_end` columns then time attribute columns
    automatically?
    - If not, what will be the time attribute of the result relation of these
      TVFs, especially after the window aggregation?
    - If yes, how do we handle proctime?

    Regarding batch operators:
    It's great to hear that we can reuse the batch operators in continuous
    batch mode, as you mentioned in the FLIP. The current window aggregate can
    also be used in batch mode with rowtime. Do you plan to support these TVFs
    for batch mode in this FLIP? Since Table/SQL is a unified API, it would be
    great to keep the features complete in both streaming and batch mode.

    One more question, though I don't know whether it should be considered in
    this FLIP: does the new window support `grouping sets`? (It's not
    supported in the old window implementation.)

    On Fri, Oct 9, 2020 at 4:14 PM, Jark Wu <imj...@gmail.com> wrote:

    > Hi all,
    >
    > I know we have a lot of discussion and development going on right now, but
    > it would be great if we can get FLIP-145 into a votable state.
    > If there are no objections, I would like to start voting in the next few
    > days.
    >
    > Best,
    > Jark
    >
    > On Thu, 1 Oct 2020 at 14:29, Jark Wu <imj...@gmail.com> wrote:
    >
    > > Hi everyone,
    > >
    > > I have added a Performance Optimization section that describes how to
    > > improve performance in the short term and long term, and sketches the
    > > future performance potential under the new window API.
    > > Introducing the window API is just the first step; we will continuously
    > > improve the performance to make it powerful and useful.
    > >
    > > Best,
    > > Jark
    > >
    > > On Thu, 1 Oct 2020 at 14:28, Jark Wu <imj...@gmail.com> wrote:
    > >
    > >> Hi Pengcheng,
    > >>
    > >> Yes, the window TVF is part of the FLIP. You are welcome to contribute
    > >> and join the discussion.
    > >> Regarding SESSION window aggregation, users can use the existing
    > >> grouped session window function.
    > >>
    > >> Best,
    > >> Jark
    > >>
    > >> On Sun, 27 Sep 2020 at 21:24, liupengcheng <pengchengliucr...@gmail.com> wrote:
    > >>
    > >>> Hi Jark,
    > >>>         Thanks for the reply. Yes, I think it's a good feature; it can
    > >>>         improve the NRT scenarios, as you mentioned in the FLIP. I also
    > >>>         think it can greatly improve streaming SQL: it supports richer
    > >>>         window operations in Flink SQL and brings great convenience to
    > >>>         users (Flink currently supports only group windows).
    > >>>
    > >>>         Regarding the SESSION window, I think it's especially useful for
    > >>>         user behavior analysis (e.g. counting user visits on a news
    > >>>         website or social platform), but I agree that we can keep it out
    > >>>         of the FLIP for now to catch up with 1.12.
    > >>>
    > >>>         Recently, I've done some work on the stream planner with the
    > >>>         TVFs, and I'm willing to contribute to this part. Is it within
    > >>>         the scope of this FLIP?
    > >>>
    > >>>         Best,
    > >>>         PengchengLiu
    > >>>
    > >>>
    > >>> On 2020/9/26 at 11:09 PM, "Jark Wu" <imj...@gmail.com> wrote:
    > >>>
    > >>>     Hi Pengcheng,
    > >>>
    > >>>     It's great to see that you also need window join.
    > >>>     You are right, the windowing TVF is a powerful feature that can
    > >>>     support more operations in the future.
    > >>>     I think of it as similar to the date-time "partition" selection in
    > >>>     batch SQL jobs. With this new syntax, I think it is possible to
    > >>>     migrate traditional batch SQL jobs to Flink SQL by changing a few
    > >>>     lines.
    > >>>
    > >>>     Regarding the SESSION window, we kept it out of the FLIP on purpose,
    > >>>     because we want to keep the FLIP small to catch up with 1.12, and a
    > >>>     SESSION TVF is rarely useful (e.g. session window join?).
    > >>>
    > >>>     Best,
    > >>>     Jark
    > >>>
    > >>>     On Fri, 25 Sep 2020 at 22:59, liupengcheng <pengchengliucr...@gmail.com> wrote:
    > >>>
    > >>>     > Hi Jark,
    > >>>     >         I'm very interested in this feature, and I have also been
    > >>>     >         working on it recently.
    > >>>     >         I just had a glance at the FLIP. It looks good, but I found
    > >>>     >         that there is no plan to add SESSION windows.
    > >>>     >         Also, I think there are more things we can do based on this
    > >>>     >         new syntax. For example:
    > >>>     >         - window sort support
    > >>>     >         - window union/intersect/minus support
    > >>>     >         - improved dimension table join
    > >>>     >         We can have a deeper discussion on this new feature later.
    > >>>     >         I've also recently opened a Jira issue related to this
    > >>>     >         feature:
    > >>>     >         https://issues.apache.org/jira/browse/FLINK-18830
    > >>>     >
    > >>>     > Best!
    > >>>     > PengchengLiu
    > >>>     >
    > >>>     > On 2020/9/25 at 10:30 PM, "Jark Wu" <imj...@gmail.com> wrote:
    > >>>     >
    > >>>     >     Hi everyone,
    > >>>     >
    > >>>     >     I want to start a FLIP about supporting windowing table-valued
    > >>>     >     functions (TVF).
    > >>>     >     The main purpose of this FLIP is to improve the near real-time
    > >>>     >     (NRT) experience of Flink.
    > >>>     >
    > >>>     >     FLIP-145:
    > >>>     >     https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function
    > >>>     >
    > >>>     >     We want to introduce TUMBLE, HOP, and CUMULATE windowing TVFs,
    > >>>     >     where CUMULATE is a new kind of window.
    > >>>     >     With the windowing TVFs, we can support richer operations on
    > >>>     >     windows, including window join, window TopN and so on.
    > >>>     >     This makes things simple: we only need to assign windows at the
    > >>>     >     beginning of the query, and then apply operations after that
    > >>>     >     like traditional batch SQL.
    > >>>     >     We hope it can help to reduce the learning curve of windows,
    > >>>     >     improve NRT for Flink, and attract more batch users.
    > >>>     >
    > >>>     >     A simple code snippet for a 10-minute tumbling window aggregate:
    > >>>     >
    > >>>     >     SELECT window_start, window_end, SUM(price)
    > >>>     >     FROM TABLE(
    > >>>     >         TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
    > >>>     >     GROUP BY window_start, window_end;
    > >>>     >
    > >>>     >     I'm looking forward to your feedback.
    > >>>     >
    > >>>     >     Best,
    > >>>     >     Jark
    > >>>     >
    > >>>     >
    > >>>     >
    > >>>
    > >>>
    > >>>
    >


    -- 

    Best,
    Benchao Li
