Hi Jark,
thanks for the deep investigation and communication with Calcite and
Beam folks.
Given the new findings, +1 to vote.
Regards,
Timo
On 09.11.20 05:22, Jark Wu wrote:
Hi all,
After some offline discussion and investigation with Timo and Danny, I have
updated the FLIP-145.
FLIP-145:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function
Here are the updates:
1. Add SESSION window syntax and examples.
2. Time A
Hi, Timo ~
> We are not forced by
the standard to do it as stated in the `One SQL to Rule it all` paper
No, following the SQL standard is always better. I think this is common
practice for Flink SQL now; without a standard, everyone can state a
preference and the discussion would easily go to
window join -> dimension table join -> stream aggregate ->
> >> stream
> >>>>>>> sort
> >>>>>>>>>
> >>>>>>>>> Just as what you said, the key clause can be used to
> >> distinguish
Regarding the example you provided, I think the semantics of the SQL in
your example, which does an interval join (e.g. with TUMBLE_ROWTIME) after
window aggregation, is not clear in the current implementation, and I think
that's a strong reason why we need the new TVF syntax.
With the new syntax
the
> >>>>>>> SQL(use the time column as
> >>>>>>> the watermark column) and thus make it work. I think this can
> >>>> save
> >>>>>>> much time to revise the event
> >>>>>>> tim
ded window
operations.
Regarding the example you provided, I think the semantics of the SQL in
your example, which does an interval join (e.g. with TUMBLE_ROWTIME) after
window aggregation, is not clear in the current implementation, and I think
that's a strong reason why we need the new TVF syntax.
> > >>>
>>>> > >>> IMO, the new windowed operators and the current time operators
>>>> are two
>>>> > >>> different sets of functions,
>>>> > >>> just like time operators and non-time operators are
> can
>>> > >>> use the grouped window aggregates instead of the window TVFs.
>>> > >>>
>>> > >>> The key idea of window TVF is that all the operators in the
>>> pipeline
>>> > are
>>> > >>> based on the **wi
>> on, order by) contains window_start and window_end,
>> > >>> it can be translated into windowed operators.
>> > >>> Thus, we will have windowed CEP, windowed sort, windowed over
>> aggregate
>> > >>> in
>> > >>> the
ate the integration more in the future if
> > users
> > >>> need it. Actually, I don't fully understand the scenario of
> integrating
> > >>> window TVF and time operators at this point.
> > >>> For example, interval join an input stream a
can wait for more inputs from users when the window TVF is
> >>> released and we can elaborate it again.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Sat, 10 Oct 2020 at 12:01, 刘 芃成
> wrote:
> >>>
> >>
k I got your point, actually, in current implementation
>>> for
>>> > group window aggregation, the value of time attributes(e.g.
>>> > TUMBLE_ROWTIME/TUMBLE_PROCTIME) is calculated as (window_end – 1), so I
>>> > think we can just use it directly if you
to use in case of cascaded window
>> operations.
>> > Regarding the example you provided, I think the semantics of the SQL in
>> > your example, which does an interval join (e.g. with TUMBLE_ROWTIME) after
>> > window aggregation, is not clear in the current implementation
nd I
> think
> > that’s a strong reason why we need the new TVFs syntax.
> > With the new syntax, users should understand which time column to
> > use and how to generate it when doing an interval join, etc.
> >
> > Best,
> > Pengcheng
> >
> >
se in case of cascaded window
> > > operations.
> > > Regarding the example you provided, I think the semantics of the SQL in
> > > your example, which does an interval join (e.g. with TUMBLE_ROWTIME) after
> > > window aggregation, is not clear in the current imple
TVFs syntax.
>> With the new syntax, users should understand which time column to
>> use and how to generate it when doing an interval join, etc.
>>
>> Best,
>> Pengcheng
>>
>> From: Benchao Li
>> Date: Saturday, October 10, 2020, 11:02 AM
>> To: pengch
think
> that’s a strong reason why we need the new TVFs syntax.
> With the new syntax, users should understand which time column to
> use and how to generate it when doing an interval join, etc.
>
> Best,
> Pengcheng
>
> From: Benchao Li
> Date: Saturday, October 10, 2020, 11:02 AM
> To
-145: Support SQL windowing table-valued function
Hi pengcheng,
Thanks for your response.
I know that the original time attribute column will be retained after the
TVF;
what I'm asking is how we get the time attribute column after
aggregation.
Your answer did not remove my doubts about this.
It's ok if we did not plan to integrate new TV
Hi Benchao,
In TVFs, the time attributes are just passed through from the parent rels,
and the TVFs just add two
additional window attributes (i.e. window_start & window_end). Also, I
think the time column can be not only a time attribute
of type `TimeIndicatorType` but also a regular c
Hi Jark,
2 & 3 sound good to me.
Regarding the time attribute,
I still have some questions. I know it's easy to support cascaded window
aggregates using the new TVFs.
However, there are other places that need a time attribute:
- CEP
- interval join
- order by
- over window
If there is no time attribut
Hi Benchao,
1) time attribute
Yes. We don't need a time attribute auxiliary function, because the new
window operations are all based on the
window_start and window_end columns instead of on the time attributes. So
we don't need to propagate time attributes.
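
To make the idea above concrete, here is a minimal Python sketch (not Flink
code). The helper names `tumble` and `sum_by_window`, the sample rows, and the
choice to re-window on the first level's window_start are all assumptions made
for illustration; the FLIP itself defines the actual SQL syntax and semantics.

```python
from collections import defaultdict


def tumble(rows, time_col, size_ms):
    """Hypothetical stand-in for a TUMBLE window TVF over epoch-millisecond timestamps.

    Every input column is passed through unchanged; window_start (inclusive)
    and window_end (exclusive) are added to each row.
    """
    out = []
    for row in rows:
        start = row[time_col] - row[time_col] % size_ms
        out.append({**row, "window_start": start, "window_end": start + size_ms})
    return out


def sum_by_window(rows, value_col):
    """Group by (window_start, window_end) and sum value_col, like a windowed aggregate."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["window_start"], row["window_end"])] += row[value_col]
    return [
        {"window_start": s, "window_end": e, value_col: total}
        for (s, e), total in sorted(totals.items())
    ]


# Made-up sample data for the sketch.
events = [
    {"ts": 1000, "price": 2},
    {"ts": 4000, "price": 3},
    {"ts": 6000, "price": 5},
    {"ts": 11000, "price": 7},
]

# First level: 5-second windows.
five_sec = sum_by_window(tumble(events, "ts", 5000), "price")

# Cascaded level: 10-second windows, re-windowing on the first level's
# window_start (an assumption of this sketch, not taken from the FLIP).
ten_sec = sum_by_window(tumble(five_sec, "window_start", 10000), "price")
```

In this sketch the second aggregation only needs the window columns produced
by the first one; no separate time attribute is carried along.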
Cascaded window aggregate can be express
Hi Benchao,
Welcome to the discussion. Yes, this new syntax can make SQL
clearer and simpler.
For your first question, the `window_start` and `window_end` columns
will be added automatically,
so we don't need to use auxiliary group functions to infer or access
the
Thanks Jark for bringing up this discussion, I like this FLIP very much.
Especially the cumulate window, it's much like the current TUMBLE window +
Fast Emit (which is an undocumented experimental feature), however, it's
more powerful.
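
As a rough illustration of the comparison above, here is a minimal Python
sketch of how a cumulate window could assign a timestamp to windows. It
assumes the usual reading of a cumulating window: all windows share the start
of the enclosing max-size window and grow by a fixed step. The function name
`cumulate_windows` and its parameters are hypothetical, not a Flink API.

```python
def cumulate_windows(ts, step_ms, max_size_ms):
    """Return the (window_start, window_end) pairs that contain ts.

    All windows start at the beginning of the enclosing max-size window and
    end at start + k * step_ms for k = 1 .. max_size_ms / step_ms. A timestamp
    belongs to every such window whose exclusive end lies after it.
    """
    if max_size_ms % step_ms != 0:
        raise ValueError("max_size_ms must be a multiple of step_ms")
    start = ts - ts % max_size_ms
    windows = []
    end = start + step_ms
    while end <= start + max_size_ms:
        if end > ts:
            windows.append((start, end))
        end += step_ms
    return windows
```

With a 1-second step and a 3-second maximum size, a row at 500 ms contributes
to the windows ending at 1000, 2000 and 3000 ms, which is what gives the
"early result for the big window" behaviour that resembles TUMBLE plus an
early emit.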
And this will make the SQL semantics more standard, especially fo
Hi all,
I know we have a lot of discussion and development ongoing right now, but
it would be great if we can get FLIP-145 into a votable state.
If there are no objections, I would like to start voting in the next few days.
Best,
Jark
On Thu, 1 Oct 2020 at 14:29, Jark Wu wrote:
> Hi everyone,
>
>
Hi everyone,
I have added a section for Performance Optimization to describe how to
improve the performance in the short-term and long-term
and sketch the future performance potential under the new window API.
Introducing the window API is just the first step, we will
continuously improve the perf
Hi Pengcheng,
Yes, the window TVF is part of the FLIP. Welcome to contribute and join the
discussion.
Regarding the SESSION window aggregation, users can use the existing
grouped session window function.
Best,
Jark
On Sun, 27 Sep 2020 at 21:24, liupengcheng
wrote:
> Hi Jark,
> Thanks f
Hi Jark,
Thanks for the reply. Yes, I think it's a good feature; it can improve the
NRT scenarios
as you mentioned in the FLIP. Also, I think it can improve
streaming SQL greatly:
it can support richer window operations in Flink SQL and bring great
convenience to users.
Hi pengcheng,
It's great to see that you also need window joins.
You are right, the windowing TVF is a powerful feature which can support
more operations in the future.
I think of it as similar to the date-time "partition" selection in batch SQL jobs;
with this new syntax, I think it is possible
to
Hi, Jark,
I'm very interested in this feature, and I've also been working on this
recently.
I just had a glance at the FLIP. It's good, but I found that there is
no plan to add SESSION windows.
Also, I think there can be more things we can do based on this new
syntax. For exam
Hi everyone,
I want to start a FLIP about supporting windowing table-valued functions
(TVF).
The main purpose of this FLIP is to improve the near real-time (NRT)
experience of Flink.
FLIP-145:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+functio