[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

sunjincheng (JIRA) Wed, 15 Feb 2017 21:51:07 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869275#comment-15869275
 ]


sunjincheng commented on FLINK-4557:
------------------------------------

Hi,guys，I made a preliminary implementation of this JIRA.
My approach is:
1. Calcite -> Flink
"LogicalProject with RexOver expression" – (normalize rule) -> "Calcite's 
LogicalWindow" – (opt rule) -> DataStreamRowWindowAggregate
2. datastreamAPI:
a. With partitionBy situation: 
approach1: inputDS.map().keyBy().reduce().map() //we prefer this approach, 
because we can use the current reduce state.
approach2: inputDS.map().keyBy().process()
b. Without partitionBy situation: 
inputDS.map().setParallelism(1), map has implement CheckPointedFunction.
3. About OrderBy:
According to the natural order of elements, procTime () use for generate 
end-time of the window and guaranteed pass the sql validation.
HI,Fabian Hueske IMO. Since all the above design has not been implemented yet 
in flink master, if I put all of my design into one PR, it will be very huge. I 
would like to split the design into the following subtasks:
1. first submit one JIRA.&PR with "#1Calcite -> FlINK part #2dataStreamAPI a. 
rowWindow with partitionBy"
2. then submit #2dataStreamAPI b.rowWindow without partitionBy
Does this make sense to you? It would be very appreciated if you could give 
some advice.

> Table API Stream Aggregations
> -----------------------------
>
>                 Key: FLINK-4557
>                 URL: https://issues.apache.org/jira/browse/FLINK-4557
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

Reply via email to