[ 
https://issues.apache.org/jira/browse/FLINK-5386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825546#comment-15825546
 ] 

sunjincheng commented on FLINK-5386:
------------------------------------

Hi [~fhueske] [~shaoxuan], thanks for the reply.
[~fhueske] You are right: whether it is a stream table or a batch table, we 
need to ensure correctness. As you said, we must check the window's 
properties at the implementation phase. I agree with you.

BTW, "groupBy('w)" is consistent not only with the row-window, but also 
with Calcite SQL. For instance:

GroupBy:
{code}
SELECT STREAM TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS rowtime,
  productId,
  COUNT(*) AS c,
  SUM(units) AS units
FROM Orders
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), productId;
{code}
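
As a rough illustration of what the GROUP BY TUMBLE(...) query above computes, here is a minimal plain-Scala sketch. It is not Calcite or Flink code: the helper names `tumbleStart` and `tumbleEnd` are made up for this example, and it assumes epoch-millisecond timestamps, windows aligned to the epoch, and an exclusive window end.

```scala
object TumbleSketch {
  // Hypothetical helpers, not part of Calcite or Flink.
  // Start of the tumbling window that contains `ts` (windows aligned to 0).
  def tumbleStart(ts: Long, sizeMs: Long): Long =
    ts - java.lang.Math.floorMod(ts, sizeMs)

  // Exclusive end of that window (assumed to correspond to TUMBLE_END).
  def tumbleEnd(ts: Long, sizeMs: Long): Long =
    tumbleStart(ts, sizeMs) + sizeMs

  def main(args: Array[String]): Unit = {
    val hour = 60L * 60L * 1000L
    // Made-up sample rows: (rowtime, productId, units)
    val orders = Seq((10L, "a", 2), (hour - 1, "a", 3), (hour, "a", 5), (hour + 5, "b", 1))

    // Group by (window end, productId), then compute COUNT(*) and SUM(units).
    val result = orders
      .groupBy { case (ts, productId, _) => (tumbleEnd(ts, hour), productId) }
      .map { case ((end, productId), rows) =>
        (end, productId, rows.size, rows.map(_._3).sum)
      }

    result.toSeq.sortBy(r => (r._1, r._2)).foreach(println)
  }
}
```

The point is only that the window acts as one more grouping key next to productId, which is what "groupBy('w, 'key)" expresses in the Table API.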

Over:
{code}
SELECT STREAM *
FROM (
  SELECT STREAM rowtime,
    productId,
    units,
    AVG(units) OVER product (RANGE INTERVAL '10' MINUTE PRECEDING) AS m10,
    AVG(units) OVER product (RANGE INTERVAL '7' DAY PRECEDING) AS d7
  FROM Orders
  WINDOW product AS (
    ORDER BY rowtime
    PARTITION BY productId))
WHERE m10 > d7;
{code}

The following two statements are supported by the current changes:
#1. windows are defined at the start and used later:
{code}
val windowedTable = table
 .window(Slide over 10.milli every 5.milli as 'w1)
 .window(Tumble over 5.milli  as 'w2)
 .groupBy('w1, 'key)
 .select('string, 'int.count as 'count, 'w1.start)
 .groupBy('w2, 'key)
 .select('string, 'count.sum as 'sum2)
{code}

#2. windows are defined with groupBy:
{code}
val windowedTable = table
 .window(Slide over 10.milli every 5.milli as 'w1)
 .groupBy('w1, 'key)
 .select('string, 'int.count as 'count, 'w1.start)
 .window(Tumble over 5.milli as 'w2)
 .groupBy('w2, 'key)
 .select('string, 'count.sum as 'sum2)
{code}
I hope this makes sense to you.
When you said "by tying window and groupBy together, we could avoid such 
situations", did you mean something like #2, or must it be written as 
"groupBy().window()"?
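
To make the question concrete, here is a toy model in plain Scala of the alias check discussed above. It is only a sketch and not the Flink Table API: `ToyTable` and its methods are hypothetical, and the only rule it assumes is that a window alias must be defined before a groupBy refers to it.

```scala
object WindowAliasSketch {
  // Toy stand-in for a table; NOT the Flink Table API. All names are hypothetical.
  final case class ToyTable(
      windows: Set[String] = Set.empty,
      ops: Vector[String] = Vector.empty) {

    // Register a window under an alias so that later calls can refer to it.
    def window(alias: String): ToyTable =
      copy(windows = windows + alias, ops = ops :+ ("window(" + alias + ")"))

    // Group by a window alias plus keys; fail early if the alias is undefined.
    def groupBy(alias: String, keys: String*): ToyTable = {
      require(windows.contains(alias), "undefined window alias: " + alias)
      val cols = (alias +: keys).mkString(", ")
      copy(ops = ops :+ ("groupBy(" + cols + ")"))
    }
  }

  def main(args: Array[String]): Unit = {
    // Style #1: all windows are defined up front and used later.
    val s1 = ToyTable().window("w1").window("w2")
      .groupBy("w1", "key").groupBy("w2", "key")
    // Style #2: each window is defined right before the groupBy that uses it.
    val s2 = ToyTable().window("w1").groupBy("w1", "key")
      .window("w2").groupBy("w2", "key")

    println(s1.ops.mkString(" -> "))
    println(s2.ops.mkString(" -> "))
  }
}
```

Under this assumed rule both styles pass the check, and only a groupBy on an alias that was never defined is rejected.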

References:
Azure: https://msdn.microsoft.com/en-us/library/azure/dn835051.aspx
Calcite: http://calcite.apache.org/docs/stream.html#tumbling-windows

> Refactoring Window Clause
> -------------------------
>
>                 Key: FLINK-5386
>                 URL: https://issues.apache.org/jira/browse/FLINK-5386
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> Similar to SQL, a window clause is defined "as" a symbol, which is then 
> explicitly used in groupBy/over. We propose refactoring the way 
> groupBy+window is written in the Table API as follows: 
> {code}
> val windowedTable = table
>  .window(Slide over 10.milli every 5.milli as 'w1)
>  .window(Tumble over 5.milli  as 'w2)
>  .groupBy('w1, 'key)
>  .select('string, 'int.count as 'count, 'w1.start)
>  .groupBy('w2, 'key)
>  .select('string, 'count.sum as 'sum2)
>  .window(Tumble over 5.milli  as 'w3)
>  .groupBy( 'w3) // windowAll
>  .select('sum2, 'w3.start, 'w3.end)
> {code}
> In this way, we can remove both GroupWindowedTable and the window() method in 
> GroupedTable, which makes the API a bit cleaner. In addition, for row-windows 
> we need to define the window clause as a symbol anyway. This change will make 
> the API of window and row-window consistent. An example for row-window:
> {code}
>   .window(RowXXXWindow as 'x, RowYYYWindow as 'y)
>   .select('a, 'b.count over 'x as 'xcnt, 'c.count over 'y as 'ycnt, 'x.start, 
> 'x.end)
> {code}
> What do you think? [~fhueske] [~twalthr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
