Hello everyone, Flink is widely used in Alibaba Group, especially in our Search and Recommendation Infra. Retraction is one of the most important features that we needed. We have spent lots of efforts to try to solve this problem, and gladly at the end we develop an approach which can address most of retraction problems in our production scenarios. Same as usual, we (Alibaba search-data infra team) would like to share our retraction solution to the entire Flink community. If you like this proposal, I would also like to make it as one of the FLIPs. I am attaching the design doc of "Retraction for Flink Streaming" as well as the introduction section below. I have also created a master jira (FLINK-6047) to track the discussion and design of the Flink retraction. All suggestions and comments are welcome.
*Design doc:* https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw *Introduction:* "Retraction" is an important building block for data streaming to refine the early fired results in streaming. “Early firing” are very common and widely used in many streaming scenarios, for instance “window-less” or unbounded aggregate and stream-stream inner join, windowed (with early firing) aggregate and stream-stream inner join. As described in Streaming 102, there are mainly two cases that require retractions: 1) update on the keyed table (the key is either a primaryKey (PK) on source table, or a groupKey/partitionKey in an aggregate); 2) When dynamic windows (e.g., session window) are in use, the new value may be replacing more than one previous window due to window merging. To the best of our knowledge, the retraction for the early fired streaming results has never been practically solved before. In this proposal, we develop a retraction solution and explain how it works for the problem of “update on the keyed table”. The same solution can be easily extended for the dynamic windows merging, as the key component of retraction - how to refine an early fired results - is the same across different problems. *Master Jira: * https://issues.apache.org/jira/browse/FLINK-6047 Regards, Shaoxuan