lincoln lee created FLINK-28242:
-----------------------------------
Summary: CDC source with meta columns may cause error result on
downstream stateful operators
Key: FLINK-28242
URL: https://issues.apache.org/jira/browse/FLINK-28242
Project: Flink
Issue Type: Bug
Components: Table SQL / Runtime
Affects Versions: 1.15.0
Reporter: lincoln lee
The intermediate result of the current test case
temporalJoinITCase#testEventTimeMultiTemporalJoin is wrong:
{code}
+I, 5,RMB,40,2020-08-16T00:03,null,null,null,null
+I, 2,US Dollar,1,2020-08-15T00:02,102,2020-08-15T00:00:02,102,2020-08-15T00:00:02
+I, 3,RMB,40,2020-08-15T00:03,702,2020-08-15T00:00:04,702,2020-08-15T00:00:04
-U, 2,US Dollar,1,2020-08-16T00:03,106,2020-08-16T00:02,106,2020-08-16T00:02
...
{code}
The "-U, 2,US Dollar,1,2020-08-16T00:03..." row has a different 'order_time'
value from the "+I, 2,US Dollar,1,2020-08-15T00:02..." row it is meant to
retract. After the join there is no upsert key, so the downstream operator can
only retract by matching the complete row, and that match fails in this case.
The root cause: when a CDC source carries a metadata column (e.g., the
operation time in the binlog, or the operation type), the delete or
update_before message is no longer exactly the same as the previous version of
the row. After an operation such as a join that is not on the primary key of
the CDC source, the output no longer has an upsert key, so the downstream
operator cannot retract correctly.
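The failure mode can be illustrated outside Flink with a small simulation. This
is only a sketch, not Flink code: apply_by_full_row and apply_by_key are
hypothetical helpers, and the rows are shortened to the first four columns of
the output shown above. It contrasts retracting by complete-row equality with
retracting by a key.
{code}
from collections import Counter

def apply_by_full_row(changelog):
    """Retract by complete-row equality, as an operator without an upsert key must."""
    state = Counter()
    unmatched = []  # retractions that found no identical row in state
    for kind, row in changelog:
        if kind in ("+I", "+U"):
            state[row] += 1
        elif state[row] > 0:
            state[row] -= 1
            if state[row] == 0:
                del state[row]
        else:
            unmatched.append(row)
    return state, unmatched

def apply_by_key(changelog, key_index=0):
    """Retract by key (first column here); non-key columns are ignored."""
    state = {}
    for kind, row in changelog:
        if kind in ("+I", "+U"):
            state[row[key_index]] = row
        else:
            state.pop(row[key_index], None)
    return state

# Same logical row, but the retraction carries a different order_time.
changelog = [
    ("+I", (2, "US Dollar", 1, "2020-08-15T00:02")),
    ("-U", (2, "US Dollar", 1, "2020-08-16T00:03")),
]

state, unmatched = apply_by_full_row(changelog)
print(len(state), len(unmatched))  # stale row stays in state, retraction unmatched
print(apply_by_key(changelog))     # keyed retraction removes the row
{code}
With full-row matching the "+I" row is never removed and the "-U" row matches
nothing, which is the situation described above. With a key available, the
differing column does not matter.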
This is obscure to users, so we should find a way to at least report the error
to users (at compile time), or find another solution that eliminates the
problem completely.
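One possible shape for such a compile-time report, sketched on a simplified
linear pipeline. Everything here is hypothetical and is not the Flink planner
API: PlanNode, its fields, and find_unsafe_operators are illustrative names
only. The idea is to flag any operator that consumes retractions when its input
has no upsert key and a source with metadata columns sits upstream.
{code}
from dataclasses import dataclass

@dataclass
class PlanNode:
    name: str
    has_metadata_columns: bool = False  # e.g. a CDC source exposing binlog op time
    consumes_retractions: bool = False  # stateful operator handling -U/-D input
    output_upsert_key: tuple = ()       # empty tuple: no upsert key on the output

def find_unsafe_operators(pipeline):
    """Return names of operators that would have to retract by complete row
    on input that may carry non-deterministic metadata columns."""
    unsafe = []
    metadata_upstream = False
    input_upsert_key = ()
    for node in pipeline:
        if metadata_upstream and node.consumes_retractions and not input_upsert_key:
            unsafe.append(node.name)
        metadata_upstream = metadata_upstream or node.has_metadata_columns
        input_upsert_key = node.output_upsert_key
    return unsafe

pipeline = [
    PlanNode("cdc_source", has_metadata_columns=True,
             output_upsert_key=("currency",)),
    # Join not on the source primary key: the upsert key is lost.
    PlanNode("join_1", consumes_retractions=True, output_upsert_key=()),
    PlanNode("join_2", consumes_retractions=True, output_upsert_key=()),
]

unsafe = find_unsafe_operators(pipeline)
if unsafe:
    print("cannot retract correctly in: " + ", ".join(unsafe))
{code}
In this sketch only join_2 is reported, because join_1 still receives input
with an upsert key. A real check would need to walk the full plan graph rather
than a list.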
--
This message was sent by Atlassian Jira
(v8.20.7#820007)