Hi everybody,

it seems that currently several contributors are working on new features
for the streaming Table API / SQL around row windows (as defined in FLIP-11
[1]) and SQL OVER-style window (FLINK-4678, FLINK-4679, FLINK-4680,
FLINK-5584).
Since these efforts overlap quite a bit I spent some time thinking about
how we can approach these features and how to avoid overlapping
contributions.

The challenge here is the following. Some of the Table API row windows as
defined by FLIP-11 [1] are basically SQL OVER windows while other cannot be
easily expressed as such (TumbleRows for row-count intervals, SessionRows).
However, since Calcite already supports SQL OVER windows, we can reuse the
optimization logic for some of the Table API row windows. I also thought
about the semantics of the TumbleRows and SessionRows windows as defined in
FLIP-11 and came to the conclusion that these are not well defined in
FLIP-11 and should rather be defined as SlideRows windows with a special
PARTITION BY clause.

I propose to approach SQL OVER windows and Table API row windows as follows:

We start with three simple cases for SQL OVER windows (not Table API yet):

* OVER RANGE for event time
* OVER RANGE for processing time
* OVER ROW for processing time

All cases fulfill the following restrictions:
- All aggregations in SELECT must refer to the same window.
- PARTITION BY may not contain the rowtime attribute.
- ORDER BY must be on rowtime attribute (for event time) or on a marker
function that indicates processing time. Additional sort attributes are not
supported initially.
- only "BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "BETWEEN x
PRECEDING AND CURRENT ROW" are supported.

OVER ROW for event time cannot be easily supported. With event time, we may
have late records which need to be injected into the order of records. When
a record in injected in to the order where a row-count window has already
been computed, this and all following windows will change. We could either
drop the record or sent out many retraction records. I think it is best to
not open this can of worms at this point.

The rational for all of the above restrictions is to have first versions of
OVER windows soon.
Once we have the above cases covered we can extend and remove limitations
as follows:

- Table API SlideRow windows (with the same restrictions as above). This
will be mostly API work since the execution part has been solved before.
- Add support for FOLLOWING (except UNBOUNDED FOLLOWING)
- Add support for different windows in SELECT. All windows must be
partitioned and ordered in the same way.
- Add support for additional ORDER BY attributes (besides time).

As I said before, TumbleRows and SessionRows windows as in FLIP-11 are not
well defined, IMO.
They can be expressed as SlideRows windows with special partitioning
(partitioning on fixed, non-overlapping time ranges for TumbleRows, and
gap-separated, non-overlapping time ranges for SessionRows)
I would not start to work on those yet.

I would like to close all related JIRA issues (FLINK-4678, FLINK-4679,
FLINK-4680, FLINK-5584) and restructure the development of these features
as outlined above with corresponding JIRA issues.

What do others think? (I cc'ed the contributors assigned to the above JIRA
issues)

Best, Fabian

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations

Reply via email to