[ https://issues.apache.org/jira/browse/FLINK-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaoxuan Wang updated FLINK-6047:
---------------------------------
    Description:
[Design doc]:
https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw

[Introduction]:
"Retraction" is an important building block in data streaming for refining early-fired results. “Early firing” is very common and widely used in many streaming scenarios, for instance “window-less” or unbounded aggregates and stream-stream inner joins, and windowed (with early firing) aggregates and stream-stream inner joins. There are mainly two cases that require retractions: 1) an update on a keyed table (the key is either a primaryKey (PK) on the source table, or a groupKey/partitionKey in an aggregate); 2) when dynamic windows (e.g., session windows) are in use, a new value may replace more than one previous window due to window merging.

To the best of our knowledge, retraction for early-fired streaming results has never been practically solved before. In this proposal, we develop a retraction solution and explain how it works for the problem of “update on the keyed table”. The same solution can be easily extended to dynamic window merging, as the key component of retraction - how to refine an early-fired result - is the same across different problems.
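To make the "update on the keyed table" case concrete, here is a minimal sketch in plain Python. It is not Flink code and does not reflect the design doc's actual mechanism; the function names and the `(is_accumulate, key, count)` message shape are assumptions chosen only for illustration. A first aggregate fires a running count per key on every input; a second aggregate counts how many keys currently have each count. The second aggregate is only correct if the first one retracts its previously fired result before emitting the refined one.

```python
from collections import defaultdict


def count_per_key(events):
    """First aggregate: running count per key, fired early on every input.

    Yields hypothetical (is_accumulate, key, count) messages. Before emitting
    an updated count, it retracts the count previously emitted for that key.
    """
    counts = defaultdict(int)
    for key in events:
        if counts[key] > 0:
            # Retract the early-fired result that is about to become stale.
            yield (False, key, counts[key])
        counts[key] += 1
        # Accumulate the refined result.
        yield (True, key, counts[key])


def keys_per_count(messages):
    """Second aggregate: number of keys that currently have each count."""
    freq = defaultdict(int)
    for is_accumulate, _key, count in messages:
        freq[count] += 1 if is_accumulate else -1
        if freq[count] == 0:
            del freq[count]
    return dict(freq)


events = ["hello", "world", "hello"]

# Downstream sees both retract and accumulate messages.
with_retraction = keys_per_count(count_per_key(events))

# Downstream sees only accumulate messages (retractions dropped).
without_retraction = keys_per_count(
    m for m in count_per_key(events) if m[0]
)

print(with_retraction)     # {1: 1, 2: 1}
print(without_retraction)  # {1: 2, 2: 1}
```

Without the retract message, the downstream aggregate still counts "hello" under count 1 after it has moved to count 2, so it reports two keys with count 1 instead of one.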
[Proposed Jiras]:
Implement decoration phase for rewriting predicated logical plan after volcano optimization phase
Implement optimizer for retraction and turn on retraction for over window aggregate
Implement and turn on the retraction for grouping window aggregate
Implement and turn on retraction for table source
Implement and turn on retraction for table sink
Implement and turn on retraction for stream-stream inner join
Implement the retraction for the early firing window
Implement the retraction for the dynamic window with early firing

  was:
[Design doc]:
https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw

[Introduction]:
"Retraction" is an important building block in data streaming for refining early-fired results. “Early firing” is very common and widely used in many streaming scenarios, for instance “window-less” or unbounded aggregates and stream-stream inner joins, and windowed (with early firing) aggregates and stream-stream inner joins. There are mainly two cases that require retractions: 1) an update on a keyed table (the key is either a primaryKey (PK) on the source table, or a groupKey/partitionKey in an aggregate); 2) when dynamic windows (e.g., session windows) are in use, a new value may replace more than one previous window due to window merging.

To the best of our knowledge, retraction for early-fired streaming results has never been practically solved before. In this proposal, we develop a retraction solution and explain how it works for the problem of “update on the keyed table”. The same solution can be easily extended to dynamic window merging, as the key component of retraction - how to refine an early-fired result - is the same across different problems.
[Proposed Jiras]:
Implement decoration phase for predicated logical plan rewriting after volcano optimization phase
Add source with table primary key and replace table property
Add sink tableInsert and NeedRetract property
Implement the retraction for partitioned unbounded over window aggregate
Implement the retraction for stream-stream inner join
Implement the retraction for the early firing window
Implement the retraction for the dynamic window with early firing


> Master Jira for "Retraction for Flink Streaming"
> ------------------------------------------------
>
>                 Key: FLINK-6047
>                 URL: https://issues.apache.org/jira/browse/FLINK-6047
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Shaoxuan Wang
>            Assignee: Shaoxuan Wang
>
> [Design doc]:
> https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw
> [Introduction]:
> "Retraction" is an important building block in data streaming for refining
> early-fired results. “Early firing” is very common and widely used in many
> streaming scenarios, for instance “window-less” or unbounded aggregates and
> stream-stream inner joins, and windowed (with early firing) aggregates and
> stream-stream inner joins. There are mainly two cases that require
> retractions: 1) an update on a keyed table (the key is either a primaryKey
> (PK) on the source table, or a groupKey/partitionKey in an aggregate); 2)
> when dynamic windows (e.g., session windows) are in use, a new value may
> replace more than one previous window due to window merging.
> To the best of our knowledge, retraction for early-fired streaming results
> has never been practically solved before. In this proposal, we develop a
> retraction solution and explain how it works for the problem of “update on
> the keyed table”. The same solution can be easily extended to dynamic window
> merging, as the key component of retraction - how to refine an early-fired
> result - is the same across different problems.
> [Proposed Jiras]:
> Implement decoration phase for rewriting predicated logical plan after
> volcano optimization phase
> Implement optimizer for retraction and turn on retraction for over window
> aggregate
> Implement and turn on the retraction for grouping window aggregate
> Implement and turn on retraction for table source
> Implement and turn on retraction for table sink
> Implement and turn on retraction for stream-stream inner join
> Implement the retraction for the early firing window
> Implement the retraction for the dynamic window with early firing



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)