[ 
https://issues.apache.org/jira/browse/FLINK-36750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17899699#comment-17899699
 ] 

Leonard Xu commented on FLINK-36750:
------------------------------------

master: 3a2d799fd8eceea76b0abfb066e70b2ca4d58648
3.2: fb2a1d0c38fb34fac028de41bee25b4d5fb39989

> Paimon connector would reuse sequence number when schema evolution happened
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-36750
>                 URL: https://issues.apache.org/jira/browse/FLINK-36750
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.0
>            Reporter: Yanquan Lv
>            Assignee: Yanquan Lv
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: cdc-3.2.1
>
>         Attachments: image-2024-11-20-13-00-58-282.png, 
> image-2024-11-20-13-02-47-612.png, image-2024-11-20-13-04-53-635.png
>
>
> When schema evolution happens, we prepare a commit and recreate a new 
> FileStoreWrite to obtain the latest schema. However, FileStoreWrite maintains 
> some information in memory, such as the sequence number, so we can't simply 
> remove it and recreate a new one. Instead, we should extract that information 
> from the old Write and rebuild the new one with it.
> The sequence number is used to determine the order of two records with 
> identical primary keys. If we don't strictly maintain this order, it may lead 
> to unexpected results.
> The following pictures show the problem we are currently facing:
> 1) Schema evolution happened between the second and third 
> files (`{*}schema_id{*}` changed).
> !image-2024-11-20-13-04-53-635.png!
> 2) The sequence numbers here are expected to be increasing; however, there is 
> an overlap in `{*}min_sequence_number{*}` between the third file and the 
> second file.
> !image-2024-11-20-13-02-47-612.png!
> Because the sequence numbers are out of order, we may read the 
> update-before data.
>  
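
For illustration only, the overlap described in the issue can be reproduced with a toy model. The sketch below does not use Paimon's API: `ToyWriter`, `FileMeta`, `run`, and `hasOverlap` are hypothetical names, and the only assumption carried over from the description is that the writer keeps its next sequence number in memory. It compares recreating the writer from scratch after a schema change with recreating it from the old writer's extracted state.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceNumberSketch {

    /** Metadata of one flushed file, mirroring the columns named in the issue. */
    static final class FileMeta {
        final long schemaId;
        final long minSequenceNumber;
        final long maxSequenceNumber;

        FileMeta(long schemaId, long minSequenceNumber, long maxSequenceNumber) {
            this.schemaId = schemaId;
            this.minSequenceNumber = minSequenceNumber;
            this.maxSequenceNumber = maxSequenceNumber;
        }
    }

    /** Hypothetical stand-in for a writer whose sequence counter lives only in memory. */
    static final class ToyWriter {
        private final long schemaId;
        private long nextSequenceNumber;
        private long batchStart;

        ToyWriter(long schemaId, long restoredSequenceNumber) {
            this.schemaId = schemaId;
            this.nextSequenceNumber = restoredSequenceNumber;
            this.batchStart = restoredSequenceNumber;
        }

        /** Assigns one sequence number to each written record. */
        void write(int records) {
            nextSequenceNumber += records;
        }

        /** Closes the current batch and reports its sequence number range. */
        FileMeta flush() {
            FileMeta meta = new FileMeta(schemaId, batchStart, nextSequenceNumber - 1);
            batchStart = nextSequenceNumber;
            return meta;
        }

        long nextSequenceNumber() {
            return nextSequenceNumber;
        }
    }

    /** Writes two files, simulates a schema change, then writes a third file. */
    static List<FileMeta> run(boolean carryOverState) {
        List<FileMeta> files = new ArrayList<>();
        ToyWriter writer = new ToyWriter(0, 0);
        writer.write(3);
        files.add(writer.flush());
        writer.write(3);
        files.add(writer.flush());

        // Schema evolution: the writer is recreated with the new schema id.
        // Without carrying over state, the counter silently restarts at 0.
        long restored = carryOverState ? writer.nextSequenceNumber() : 0;
        writer = new ToyWriter(1, restored);
        writer.write(3);
        files.add(writer.flush());
        return files;
    }

    /** True if any file starts at or below the previous file's max sequence number. */
    static boolean hasOverlap(List<FileMeta> files) {
        for (int i = 1; i < files.size(); i++) {
            if (files.get(i).minSequenceNumber <= files.get(i - 1).maxSequenceNumber) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("recreated from scratch, overlap: " + hasOverlap(run(false)));
        System.out.println("state carried over, overlap: " + hasOverlap(run(true)));
    }
}
```

In this model the third file reuses sequence numbers starting from 0 when the writer is recreated from scratch, and continues from the previous maximum when the counter is extracted and passed to the new writer. The actual fix is in the commits referenced above.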



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
