lincoln lee created FLINK-28242:
-----------------------------------

Summary: CDC source with meta columns may cause error result on downstream stateful operators
Key: FLINK-28242
URL: https://issues.apache.org/jira/browse/FLINK-28242
Project: Flink
Issue Type: Bug
Components: Table SQL / Runtime
Affects Versions: 1.15.0
Reporter: lincoln lee

The intermediate result of the current test case TemporalJoinITCase#testEventTimeMultiTemporalJoin is wrong:

{code}
+I, 5,RMB,40,2020-08-16T00:03,null,null,null,null
+I, 2,US Dollar,1,2020-08-15T00:02,102,2020-08-15T00:00:02,102,2020-08-15T00:00:02
+I, 3,RMB,40,2020-08-15T00:03,702,2020-08-15T00:00:04,702,2020-08-15T00:00:04
-U, 2,US Dollar,1,2020-08-16T00:03,106,2020-08-16T00:02,106,2020-08-16T00:02
...
{code}

This happens because the row "-U, 2,US Dollar,1,2020-08-16T00:03..." has a different 'order_time' value from the earlier row "+I, 2,US Dollar,1,2020-08-15T00:02...". After the join there is no upsert key, so the downstream operator can only retract by the complete row, and the retraction fails in this case.

The root cause is as follows. When a CDC source carries a metadata column (e.g., the operation time in the binlog, or the operation type), the delete|update_before message is not exactly the same as the previous version of the row. After some operations, such as a join that is not on the primary key of the CDC source, the output no longer has an upsert key, and the downstream operator then cannot retract correctly.

This is obscure to users. We should think of a way to at least report the error to users (at compile time), or find another solution that eliminates the problem completely.
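
To make the failure mode concrete, here is a minimal toy model in Python. It is not Flink code: the `apply_changelog` helper and the three-column rows (id, currency, metadata time) are hypothetical simplifications. It only assumes what is described above, namely that without an upsert key a downstream operator has to match a retraction against its state by full-row equality.

```python
from collections import Counter

def apply_changelog(changes):
    """Toy materialization of a changelog that has no upsert key.

    Retractions (-U / -D) can only be matched by full-row equality,
    so a retraction that differs in any column finds nothing to remove.
    """
    state = Counter()   # row -> number of live copies
    unmatched = []      # retractions that matched no row in state
    for op, row in changes:
        if op in ("+I", "+U"):
            state[row] += 1
        elif op in ("-U", "-D"):
            if state[row] > 0:
                state[row] -= 1
                if state[row] == 0:
                    del state[row]
            else:
                unmatched.append(row)
    return state, unmatched

# Hypothetical rows: (id, currency, metadata time carried by the CDC source).
old = (2, "US Dollar", "2020-08-15T00:02")
new = (2, "US Dollar", "2020-08-16T00:03")

# Retraction repeats the previous row exactly: the old version is removed.
ok_state, ok_unmatched = apply_changelog([("+I", old), ("-U", old), ("+U", new)])

# Retraction carries a new metadata value: the old version is never removed.
bad_state, bad_unmatched = apply_changelog([("+I", old), ("-U", new), ("+U", new)])

print(dict(ok_state), ok_unmatched)
print(dict(bad_state), bad_unmatched)
```

In the second run the stale row stays in the state next to the new one, which is the kind of wrong intermediate result reported above.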