[ 
https://issues.apache.org/jira/browse/HUDI-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445798#comment-17445798
 ] 

Danny Chen commented on HUDI-2752:
----------------------------------

A very common case is a CDC data stream written into Hudi: if the DELETEs and 
INSERTs are out of order, or if there are multiple INSERTs and DELETEs in one 
insert batch, data loss happens.
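
As a rough illustration, here is a toy Python model of the problem. It is not 
Hudi code: the function names and the (op, key, ts, value) event layout are 
assumptions made up for this sketch. The first function mimics applying all 
deletes after the data of a batch; the second keeps, per key, only the event 
with the greatest event time, which is roughly what #preCombine ordering 
would give.

```python
# Toy model only (hypothetical names, not the Hudi API).
# Each event is (op, key, ts, value); ts is the event-time ordering value.

def merge_deletes_last(events):
    """Apply upserts first (latest ts wins), then apply every delete."""
    table = {}
    for op, key, ts, value in events:
        if op == "UPSERT" and (key not in table or ts >= table[key][0]):
            table[key] = (ts, value)
    # Deletes are applied unconditionally, ignoring their event time.
    for op, key, ts, value in events:
        if op == "DELETE":
            table.pop(key, None)
    return {k: v for k, (ts, v) in table.items()}


def merge_by_event_time(events):
    """Keep only the latest event per key; drop keys whose latest is a DELETE."""
    latest = {}
    for op, key, ts, value in events:
        if key not in latest or ts >= latest[key][1]:
            latest[key] = (op, ts, value)
    return {k: v for k, (op, ts, v) in latest.items() if op != "DELETE"}


# One batch: insert, delete, then re-insert of the same key.
batch = [
    ("UPSERT", "id1", 1, "a"),
    ("DELETE", "id1", 2, None),
    ("UPSERT", "id1", 3, "b"),
]

print(merge_deletes_last(batch))   # {}            -> the re-inserted row is lost
print(merge_by_event_time(batch))  # {'id1': 'b'}  -> the re-inserted row survives
```

The same difference shows up when a DELETE with an older event time arrives 
after a newer INSERT.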

> The MOR DELETE block breaks the event time sequence of CDC
> ----------------------------------------------------------
>
>                 Key: HUDI-2752
>                 URL: https://issues.apache.org/jira/browse/HUDI-2752
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Flink Integration
>            Reporter: Danny Chen
>            Priority: Major
>             Fix For: 0.11.0
>
>
> Currently, the DELETE blocks are always written after the data blocks for one 
> batch of data written. When there are INSERTs/UPDATEs after the DELETE, the 
> data would be lost.
> What I can think of is that the DELETE block should at least keep the event 
> time sequence for #preCombine with other record payloads.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
