[ https://issues.apache.org/jira/browse/HUDI-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HUDI-3644:
---------------------------------
    Labels: pull-request-available  (was: )

> hoodie log scan bug cause data duplication
> ------------------------------------------
>
>                 Key: HUDI-3644
>                 URL: https://issues.apache.org/jira/browse/HUDI-3644
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: hd zhou
>            Priority: Major
>              Labels: pull-request-available
>
> AbstractHoodieLogRecordReader
>
> {code:java}
> // code placeholder
> if (!completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
>     || inflightInstantsTimeline.containsInstant(instantTime)) {
>   // hit an uncommitted block possibly from a failed write,
>   // move to the next one and skip processing this one
>   continue;
> } {code}
>
> When completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
> is true, the log file block is merged. This is a problem.
>
> Suppose a log file block is appended successfully and the deltacommit is
> then rolled back, and this instant time is not before the start of the
> active timeline. The log file block will still be merged, causing data
> duplication.
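
To make the quoted condition easier to reason about, here is a minimal sketch of it against a toy timeline. ToyTimeline, LogBlockFilterSketch and shouldSkip are hypothetical names invented for illustration; they are not Hudi classes. The sketch also assumes that "before the timeline starts" means the instant time sorts lexically before the earliest instant on the timeline, which may differ from the real implementation. It only shows which instant times pass the check; it does not reproduce the reader itself.

```java
import java.util.List;
import java.util.TreeSet;

public class LogBlockFilterSketch {

    /** Toy stand-in for a timeline: a sorted set of instant times (hypothetical, not Hudi's class). */
    static final class ToyTimeline {
        private final TreeSet<String> instants = new TreeSet<>();

        ToyTimeline(List<String> instantTimes) {
            instants.addAll(instantTimes);
        }

        boolean containsInstant(String instantTime) {
            return instants.contains(instantTime);
        }

        // Assumed semantics: true if the instant is on the timeline,
        // or if it sorts before the first instant on the timeline.
        boolean containsOrBeforeTimelineStarts(String instantTime) {
            return containsInstant(instantTime)
                || (!instants.isEmpty() && instantTime.compareTo(instants.first()) < 0);
        }
    }

    /** Mirrors the quoted condition: returns true when the log block would be skipped. */
    static boolean shouldSkip(ToyTimeline completed, ToyTimeline inflight, String instantTime) {
        return !completed.containsOrBeforeTimelineStarts(instantTime)
            || inflight.containsInstant(instantTime);
    }

    public static void main(String[] args) {
        ToyTimeline completed = new ToyTimeline(List.of("003", "005"));
        ToyTimeline inflight = new ToyTimeline(List.of("006"));

        // Completed instant: the block is merged.
        System.out.println("003 skipped? " + shouldSkip(completed, inflight, "003"));
        // Neither completed nor older than the timeline: the block is skipped.
        System.out.println("004 skipped? " + shouldSkip(completed, inflight, "004"));
        // Inflight instant: the block is skipped.
        System.out.println("006 skipped? " + shouldSkip(completed, inflight, "006"));
        // Older than the first completed instant: the block is merged,
        // because the check has no record of whether "001" was committed or rolled back.
        System.out.println("001 skipped? " + shouldSkip(completed, inflight, "001"));
    }
}
```

In this toy model, any instant time for which containsOrBeforeTimelineStarts returns true is merged, whatever actually happened to the write that produced the block. The check only looks at the completed and inflight timelines passed to it.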