[ 
https://issues.apache.org/jira/browse/HUDI-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

suheng.cloud updated HUDI-2659:
-------------------------------
    Affects Version/s: 0.10.0

> concurrent compaction problem on flink sql
> ------------------------------------------
>
>                 Key: HUDI-2659
>                 URL: https://issues.apache.org/jira/browse/HUDI-2659
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Flink Integration
>    Affects Versions: 0.10.0
>            Reporter: suheng.cloud
>            Priority: Major
>         Attachments: image-2021-11-01-13-14-40-831.png, 
> image-2021-11-01-13-16-28-695.png
>
>
> Hi, Community:
> We continuously watch the Flink compaction task, and found there may be an issue 
> after the job has run for 2 days.
> The taskmanager log shows that two compaction plans executed in sequence, in 
> which the former commit action deleted a base file (for some deduplication 
> reason?) that the latter one depended on.
> I wonder whether this will cause data loss in the end?
> The core Flink sink table params are:
> {code:java}
> 'table.type' = 'MERGE_ON_READ',
> 'write.operation' = 'upsert',
> 'read.streaming.enabled' = 'true',
> 'hive_sync.enable' = 'false',
> 'write.precombine.field' = 'ts',
> 'compaction.trigger.strategy' = 'num_commits',
> 'compaction.delta_commits' = '5',
> 'compaction.tasks' = '4',
> 'compaction.max_memory' = '10',
> 'hoodie.parquet.max.file.size' = '20971520',
> 'hoodie.parquet.small.file.limit' = '10485760',
> 'write.log.max.size' = '52428800',
> 'compaction.target_io' = '5120',
> 'changelog.enabled' = 'false',
> 'clean.retain_commits' = '20',
> 'archive.max_commits' = '30',
> 'archive.min_commits' = '20'
> {code}
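> For reference, these options would sit in a sink table DDL roughly as sketched below (the table name, schema, and base path here are illustrative placeholders, not taken from the actual job):
> {code:sql}
> CREATE TABLE hudi_sink (
>   id BIGINT,
>   name STRING,
>   ts TIMESTAMP(3),
>   PRIMARY KEY (id) NOT ENFORCED
> ) WITH (
>   'connector' = 'hudi',
>   'path' = 'hdfs:///path/to/table',  -- placeholder base path
>   'table.type' = 'MERGE_ON_READ',
>   'write.operation' = 'upsert',
>   'write.precombine.field' = 'ts',
>   'compaction.trigger.strategy' = 'num_commits',
>   'compaction.delta_commits' = '5',
>   'compaction.tasks' = '4'
> );
> {code}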
>  
> cc [~danny0405], can you also give some suggestions :)
> Thank you all~
>  
>  
> !image-2021-11-01-13-14-40-831.png!
>  
> !image-2021-11-01-13-16-28-695.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
