[ https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825546#comment-15825546 ]
sunjincheng commented on FLINK-5386:
------------------------------------

Hi [~fhueske], [~shaoxuan], thanks for the replies.

[~fhueske] You are right: whether it is a stream table or a batch table, we need to ensure correctness. As you said, we must check the window's properties in the implementation phase. I agree with you.

BTW, "groupBy('w)" is consistent not only with the row-window but also with Calcite SQL. For instance:

GroupBy:
{code}
SELECT STREAM TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS rowtime,
  productId,
  COUNT(*) AS c,
  SUM(units) AS units
FROM Orders
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), productId;
{code}

Over:
{code}
SELECT STREAM *
FROM (
  SELECT STREAM rowtime,
    productId,
    units,
    AVG(units) OVER product (RANGE INTERVAL '10' MINUTE PRECEDING) AS m10,
    AVG(units) OVER product (RANGE INTERVAL '7' DAY PRECEDING) AS d7
  FROM Orders
  WINDOW product AS (
    ORDER BY rowtime
    PARTITION BY productId))
WHERE m10 > d7;
{code}

The following two styles are supported by the current changes:

#1. Windows are defined at the start and used later:
{code}
val windowedTable = table
  .window(Slide over 10.milli every 5.milli as 'w1)
  .window(Tumble over 5.milli as 'w2)
  .groupBy('w1, 'key)
  .select('string, 'int.count as 'count, 'w1.start)
  .groupBy('w2, 'key)
  .select('string, 'count.sum as 'sum2)
{code}

#2. Windows are defined together with the groupBy that uses them:
{code}
val windowedTable = table
  .window(Slide over 10.milli every 5.milli as 'w1)
  .groupBy('w1, 'key)
  .select('string, 'int.count as 'count, 'w1.start)
  .window(Tumble over 5.milli as 'w2)
  .groupBy('w2, 'key)
  .select('string, 'count.sum as 'sum2)
{code}

I hope this makes sense to you. When you said "by tying window and groupBy together, we could avoid such situations", did you mean something like #2, or must it be written as "groupBy().window()"?
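To make the difference between the two styles concrete, here is a toy sketch of the alias check that a groupBy on a window symbol implies. It is not Flink code: ToyTable, ToyWindow, the string aliases, and the plan field are hypothetical names used only for illustration, and the real validation would also have to check the window's properties, which this sketch does not attempt.

```scala
// Toy model, not Flink code: every name below is hypothetical.
sealed trait ToyWindow { def alias: String }
final case class Tumble(sizeMs: Long, alias: String) extends ToyWindow
final case class Slide(sizeMs: Long, slideMs: Long, alias: String) extends ToyWindow

final case class ToyTable(
    windows: Map[String, ToyWindow] = Map.empty,
    plan: Vector[String] = Vector.empty) {

  // Register a window under its alias; defining the same alias twice is rejected.
  def window(w: ToyWindow): ToyTable = {
    require(!windows.contains(w.alias), s"window alias '${w.alias}' is already defined")
    copy(windows = windows + (w.alias -> w), plan = plan :+ s"window(${w.alias})")
  }

  // Group by a window alias plus keys; the alias must have been defined earlier.
  def groupBy(windowAlias: String, keys: String*): ToyTable = {
    require(windows.contains(windowAlias), s"window alias '$windowAlias' is not defined")
    val columns = (windowAlias +: keys).mkString(", ")
    copy(plan = plan :+ s"groupBy($columns)")
  }
}

object WindowAliasSketch {
  def main(args: Array[String]): Unit = {
    // Style #1: all windows are defined up front.
    val style1 = ToyTable()
      .window(Slide(10, 5, "w1"))
      .window(Tumble(5, "w2"))
      .groupBy("w1", "key")
      .groupBy("w2", "key")

    // Style #2: each window is defined right before the groupBy that uses it.
    val style2 = ToyTable()
      .window(Slide(10, 5, "w1"))
      .groupBy("w1", "key")
      .window(Tumble(5, "w2"))
      .groupBy("w2", "key")

    println(style1.plan.mkString(" -> "))
    println(style2.plan.mkString(" -> "))
  }
}
```

In this toy model both styles pass the alias check, and a groupBy on an alias that was never defined fails with an IllegalArgumentException. Whether a window defined before an earlier aggregation is still valid afterwards is exactly the question under discussion, and the sketch does not answer it.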

References:
Azure: https://msdn.microsoft.com/en-us/library/azure/dn835051.aspx
Calcite: http://calcite.apache.org/docs/stream.html#tumbling-windows

> Refactoring Window Clause
> -------------------------
>
>                 Key: FLINK-5386
>                 URL: https://issues.apache.org/jira/browse/FLINK-5386
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> Similar to SQL, the window clause is defined "as" a symbol that is
> explicitly used in groupBy/over. We propose to refactor the way
> groupBy+window is written in the Table API as follows:
> {code}
> val windowedTable = table
>   .window(Slide over 10.milli every 5.milli as 'w1)
>   .window(Tumble over 5.milli as 'w2)
>   .groupBy('w1, 'key)
>   .select('string, 'int.count as 'count, 'w1.start)
>   .groupBy('w2, 'key)
>   .select('string, 'count.sum as 'sum2)
>   .window(Tumble over 5.milli as 'w3)
>   .groupBy('w3) // windowAll
>   .select('sum2, 'w3.start, 'w3.end)
> {code}
> In this way, we can remove both GroupWindowedTable and the window() method in
> GroupedTable, which makes the API a bit cleaner. In addition, for row-windows, we
> need to define the window clause as a symbol anyway. This change will make the
> API of window and row-window consistent. Example for row-window:
> {code}
>   .window(RowXXXWindow as 'x, RowYYYWindow as 'y)
>   .select('a, 'b.count over 'x as 'xcnt, 'c.count over 'y as 'ycnt, 'x.start, 'x.end)
> {code}
> What do you think? [~fhueske] [~twalthr]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)