[ 
https://issues.apache.org/jira/browse/FLINK-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaoxuan Wang updated FLINK-6047:
---------------------------------
    Description: 
[Design doc]:
https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw

[Introduction]:
"Retraction" is an important building block for data streaming: it refines 
early-fired results. “Early firing” is very common and widely used in 
streaming scenarios, for instance “window-less” (unbounded) aggregates and 
stream-stream inner joins, as well as windowed (with early firing) aggregates 
and stream-stream inner joins. Two main cases require retraction: 1) an update 
on a keyed table (the key is either a primaryKey (PK) on the source table, or 
a groupKey/partitionKey in an aggregate); 2) when dynamic windows (e.g., 
session windows) are in use, a new value may replace more than one previous 
window due to window merging. 
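
As an illustrative sketch of case 1 (this is not Flink's API; the function 
names and the message layout are hypothetical), consider a per-key count that 
feeds a downstream count-of-counts. When a key's count changes, the upstream 
operator first retracts the value it fired earlier and then emits the new 
value, so the downstream aggregate can undo the stale contribution:

```python
from collections import defaultdict


def count_per_key(events):
    """Yield (is_accumulate, key, count) messages for an unbounded per-key count.

    Hypothetical message layout: is_accumulate=False marks a retraction of a
    previously emitted count, is_accumulate=True marks a new result.
    """
    counts = {}
    for key in events:
        old = counts.get(key)
        if old is not None:
            yield (False, key, old)  # retract the early-fired result
        counts[key] = (old or 0) + 1
        yield (True, key, counts[key])


def count_of_counts(messages):
    """Downstream aggregate: how many keys currently have each count."""
    histogram = defaultdict(int)
    for is_accumulate, _key, count in messages:
        histogram[count] += 1 if is_accumulate else -1
        if histogram[count] == 0:
            del histogram[count]  # drop buckets that were fully retracted
    return dict(histogram)


events = ["a", "b", "a", "c", "a"]
result = count_of_counts(count_per_key(events))
print(result)  # {1: 2, 3: 1}
```

Without the retraction messages the downstream result would also keep the 
stale contributions of key "a" at counts 1 and 2.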

To the best of our knowledge, retraction of early-fired streaming results has 
not been solved in practice before. In this proposal, we develop a retraction 
solution and explain how it works for the problem of “update on the keyed 
table”. The same solution can be extended to dynamic window merging, because 
the key component of retraction, how to refine an early-fired result, is the 
same across these problems.  
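
To make the window-merging case concrete, here is a minimal sketch under 
stated assumptions: session windows for a single key are kept as 
(start, end, count) tuples, windows that merely touch are treated as 
overlapping, and `add_event` and the "retract"/"accumulate" labels are 
hypothetical names rather than Flink's implementation. A new event retracts 
every previously fired window it absorbs and then emits the merged window:

```python
def add_event(sessions, t, gap):
    """Insert an event at time t into disjoint session windows (start, end, count).

    Returns (new_sessions, messages). The messages retract each previously
    fired window absorbed by the merge, then emit the merged window.
    """
    new_start, new_end = t, t + gap
    start, end, count = new_start, new_end, 1
    kept, messages = [], []
    for s, e, c in sessions:
        if s <= new_end and new_start <= e:  # overlaps the new event's window
            messages.append(("retract", (s, e, c)))
            start, end, count = min(start, s), max(end, e), count + c
        else:
            kept.append((s, e, c))
    merged = (start, end, count)
    messages.append(("accumulate", merged))
    return sorted(kept + [merged]), messages


sessions = []
sessions, m1 = add_event(sessions, 0, 5)   # first window (0, 5, 1)
sessions, m2 = add_event(sessions, 10, 5)  # separate window (10, 15, 1)
sessions, m3 = add_event(sessions, 5, 5)   # bridges both earlier windows
print(m3)
```

The third event replaces two earlier windows at once, so it produces two 
retractions followed by a single merged result.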

[Proposed Jiras]:
Implement decoration phase for rewriting predicated logical plan after volcano optimization phase
Implement optimizer for retraction and turn on retraction for over window aggregate
Implement and turn on the retraction for grouping window aggregate
Implement and turn on retraction for table source
Implement and turn on retraction for table sink
Implement and turn on retraction for stream-stream inner join
Implement the retraction for the early firing window
Implement the retraction for the dynamic window with early firing



  was:
[Design doc]:
https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw

[Introduction]:
"Retraction" is an important building block for data streaming: it refines 
early-fired results. “Early firing” is very common and widely used in 
streaming scenarios, for instance “window-less” (unbounded) aggregates and 
stream-stream inner joins, as well as windowed (with early firing) aggregates 
and stream-stream inner joins. Two main cases require retraction: 1) an update 
on a keyed table (the key is either a primaryKey (PK) on the source table, or 
a groupKey/partitionKey in an aggregate); 2) when dynamic windows (e.g., 
session windows) are in use, a new value may replace more than one previous 
window due to window merging. 

To the best of our knowledge, retraction of early-fired streaming results has 
not been solved in practice before. In this proposal, we develop a retraction 
solution and explain how it works for the problem of “update on the keyed 
table”. The same solution can be extended to dynamic window merging, because 
the key component of retraction, how to refine an early-fired result, is the 
same across these problems.  

[Proposed Jiras]:
Implement decoration phase for predicated logical plan rewriting after volcano optimization phase
Add source with table primary key and replace table property
Add sink tableInsert and NeedRetract property
Implement the retraction for partitioned unbounded over window aggregate
Implement the retraction for stream-stream inner join
Implement the retraction for the early firing window
Implement the retraction for the dynamic window with early firing




> Master Jira for "Retraction for Flink Streaming"
> ------------------------------------------------
>
>                 Key: FLINK-6047
>                 URL: https://issues.apache.org/jira/browse/FLINK-6047
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Shaoxuan Wang
>            Assignee: Shaoxuan Wang
>
> [Design doc]:
> https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw
> [Introduction]:
> "Retraction" is an important building block for data streaming: it refines 
> early-fired results. “Early firing” is very common and widely used in 
> streaming scenarios, for instance “window-less” (unbounded) aggregates and 
> stream-stream inner joins, as well as windowed (with early firing) 
> aggregates and stream-stream inner joins. Two main cases require 
> retraction: 1) an update on a keyed table (the key is either a primaryKey 
> (PK) on the source table, or a groupKey/partitionKey in an aggregate); 
> 2) when dynamic windows (e.g., session windows) are in use, a new value may 
> replace more than one previous window due to window merging. 
> To the best of our knowledge, retraction of early-fired streaming results 
> has not been solved in practice before. In this proposal, we develop a 
> retraction solution and explain how it works for the problem of “update on 
> the keyed table”. The same solution can be extended to dynamic window 
> merging, because the key component of retraction, how to refine an 
> early-fired result, is the same across these problems.  
> [Proposed Jiras]:
> Implement decoration phase for rewriting predicated logical plan after volcano optimization phase
> Implement optimizer for retraction and turn on retraction for over window aggregate
> Implement and turn on the retraction for grouping window aggregate
> Implement and turn on retraction for table source
> Implement and turn on retraction for table sink
> Implement and turn on retraction for stream-stream inner join
> Implement the retraction for the early firing window
> Implement the retraction for the dynamic window with early firing



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
