[ https://issues.apache.org/jira/browse/HUDI-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
suheng.cloud updated HUDI-2659:
-------------------------------
    Affects Version/s: 0.10.0

> concurrent compaction problem on flink sql
> ------------------------------------------
>
>                 Key: HUDI-2659
>                 URL: https://issues.apache.org/jira/browse/HUDI-2659
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Flink Integration
>    Affects Versions: 0.10.0
>            Reporter: suheng.cloud
>            Priority: Major
>         Attachments: image-2021-11-01-13-14-40-831.png, image-2021-11-01-13-16-28-695.png
>
> Hi, Community:
> We have been continuously watching the Flink compaction task, and found a possible issue after the job had run for two days.
> The TaskManager log shows that two compaction plans executed in sequence, in which the commit action of the former deleted a base file (perhaps because it was duplicated?) that the latter one depended on.
> Will this cause data loss in the end?
> The core Flink sink table parameters are:
> {code:java}
> 'table.type' = 'MERGE_ON_READ',
> 'write.operation' = 'upsert',
> 'read.streaming.enabled' = 'true',
> 'hive_sync.enable' = 'false',
> 'write.precombine.field' = 'ts',
> 'compaction.trigger.strategy' = 'num_commits',
> 'compaction.delta_commits' = '5',
> 'compaction.tasks' = '4',
> 'compaction.max_memory' = '10',
> 'hoodie.parquet.max.file.size' = '20971520',
> 'hoodie.parquet.small.file.limit' = '10485760',
> 'write.log.max.size' = '52428800',
> 'compaction.target_io' = '5120',
> 'changelog.enabled' = 'false',
> 'clean.retain_commits' = '20',
> 'archive.max_commits' = '30',
> 'archive.min_commits' = '20'{code}
>
> cc [~danny0405], can you also give some suggestions? :)
> Thank you all~
>
> !image-2021-11-01-13-14-40-831.png!
>
> !image-2021-11-01-13-16-28-695.png!

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
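
Editor's note: for context, sink options like those quoted in the issue are normally supplied in the WITH clause of a Flink SQL CREATE TABLE statement. The sketch below shows that shape; the table name, columns, and storage path are hypothetical placeholders (not taken from the report), and only the option keys/values come from the quoted configuration.

{code:sql}
-- Sketch only: table name, schema, and 'path' are hypothetical;
-- the WITH options are the ones quoted in the issue above.
CREATE TABLE hudi_sink (
  id BIGINT,
  name STRING,
  ts TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///path/to/table',            -- hypothetical location
  'table.type' = 'MERGE_ON_READ',
  'write.operation' = 'upsert',
  'read.streaming.enabled' = 'true',
  'hive_sync.enable' = 'false',
  'write.precombine.field' = 'ts',
  'compaction.trigger.strategy' = 'num_commits',
  'compaction.delta_commits' = '5',
  'compaction.tasks' = '4',
  'compaction.max_memory' = '10',
  'hoodie.parquet.max.file.size' = '20971520',
  'hoodie.parquet.small.file.limit' = '10485760',
  'write.log.max.size' = '52428800',
  'compaction.target_io' = '5120',
  'changelog.enabled' = 'false',
  'clean.retain_commits' = '20',
  'archive.max_commits' = '30',
  'archive.min_commits' = '20'
);
{code}

With 'compaction.trigger.strategy' = 'num_commits' and 'compaction.delta_commits' = '5', a compaction plan is scheduled every 5 delta commits, which is why multiple compaction plans can be pending at once on a long-running job.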