lincoln lee created FLINK-28242:
-----------------------------------

             Summary: CDC source with meta columns may cause error result on 
downstream stateful operators
                 Key: FLINK-28242
                 URL: https://issues.apache.org/jira/browse/FLINK-28242
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.15.0
            Reporter: lincoln lee


The intermediate result of the current test case 
temporalJoinITCase#testEventTimeMultiTemporalJoin is wrong:

{code}

+I,    5,RMB,40,2020-08-16T00:03,null,null,null,null
+I,    2,US Dollar,1,2020-08-15T00:02,102,2020-08-15T00:00:02,102,2020-08-15T00:00:02
+I,    3,RMB,40,2020-08-15T00:03,702,2020-08-15T00:00:04,702,2020-08-15T00:00:04
-U,   2,US Dollar,1,2020-08-16T00:03,106,2020-08-16T00:02,106,2020-08-16T00:02

...

{code}

This is because the "-U,   2,US Dollar,1,2020-08-16T00:03..." message has a 
different 'order_time' value than the earlier 
"+I,    2,US Dollar,1,2020-08-15T00:02..." message. After the join there is no 
upsert key, so the downstream operator can only retract by matching the 
complete row, and that fails in this case because no identical row exists.
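
To make the failure concrete, here is a minimal Python sketch of full-row 
retraction. It is not Flink code: the class FullRowRetractState is a 
hypothetical stand-in for a stateful operator without an upsert key, and the 
two rows are copied from the output above.

```python
from collections import Counter


class FullRowRetractState:
    """Toy stand-in (hypothetical, not a Flink class) for a stateful operator
    that has no upsert key and so must match retractions on every column."""

    def __init__(self):
        # Multiset of complete rows currently held in state.
        self.rows = Counter()

    def apply(self, kind, row):
        """Apply one changelog message.

        Returns False when a retraction finds no identical row in state.
        """
        if kind in ("+I", "+U"):
            self.rows[row] += 1
            return True
        # -U / -D: an identical row must already be present.
        if self.rows[row] > 0:
            self.rows[row] -= 1
            if self.rows[row] == 0:
                del self.rows[row]
            return True
        return False


state = FullRowRetractState()

# Rows taken from the test output quoted above.
inserted = (2, "US Dollar", 1, "2020-08-15T00:02",
            102, "2020-08-15T00:00:02", 102, "2020-08-15T00:00:02")
retracted = (2, "US Dollar", 1, "2020-08-16T00:03",
             106, "2020-08-16T00:02", 106, "2020-08-16T00:02")

ok_insert = state.apply("+I", inserted)
ok_retract = state.apply("-U", retracted)  # no identical row, so no match
stale_rows = sum(state.rows.values())      # the +I row is never removed

print(ok_insert, ok_retract, stale_rows)
```

In this sketch the -U message matches nothing, so the earlier +I row stays in 
state and any result derived from it is stale.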

The root cause is that the CDC source carries a metadata column (e.g., the 
operation time in the binlog, or the operation type), which makes a delete or 
update_before message differ from the previous version of the row. After an 
operation such as a join that is not on the primary key of the CDC source, the 
output no longer has an upsert key, so the downstream operator cannot retract 
correctly.
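
The effect of a metadata column can be sketched as follows. This is a 
simplified illustration in Python, not the behavior of any particular CDC 
connector: the event layout, the op_ts field name, and the sample values are 
all assumptions made for the example.

```python
def to_changelog(binlog_events):
    """Turn simplified binlog events into changelog rows.

    Each emitted row is the physical columns plus a trailing metadata column
    (op_ts) taken from the event that produced the message. The event layout
    here is hypothetical.
    """
    out = []
    for ev in binlog_events:
        ts = ev["op_ts"]
        before, after = ev["before"], ev["after"]
        if before is not None:
            kind = "-U" if after is not None else "-D"
            out.append((kind, before + (ts,)))
        if after is not None:
            kind = "+U" if before is not None else "+I"
            out.append((kind, after + (ts,)))
    return out


# Hypothetical events: an insert at t1, then an update of the same key at t2.
events = [
    {"op_ts": "t1", "before": None, "after": (1, "a")},
    {"op_ts": "t2", "before": (1, "a"), "after": (1, "b")},
]

changelog = to_changelog(events)
insert_row = changelog[0][1]         # (1, "a", "t1")
update_before_row = changelog[1][1]  # (1, "a", "t2")

# Full-row comparison fails because op_ts differs between the two messages.
full_row_match = insert_row == update_before_row
# Comparison on the primary key (first column) still succeeds.
key_match = insert_row[0] == update_before_row[0]

print(changelog)
print(full_row_match, key_match)
```

While the primary key is still available as an upsert key, the mismatch in 
op_ts is harmless; once the key is lost, only the full-row comparison remains 
and it no longer matches.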

This is obscure to users. We should at least find a way to report the error to 
users (at compile time), or find another solution that eliminates the problem 
completely.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
