[ https://issues.apache.org/jira/browse/HUDI-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-2961:
----------------------------
    Story Points: 2

> Async table services can race with metadata table updates
> ---------------------------------------------------------
>
>                 Key: HUDI-2961
>                 URL: https://issues.apache.org/jira/browse/HUDI-2961
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: writer-core
>            Reporter: Manoj Govindassamy
>            Assignee: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Today, metadata table updates are done inline (synchronously) with data
> table updates. Metadata table updates can also sometimes trigger table
> services such as compaction, which likewise run inline with the ongoing
> commit. So, with a single writer, updates to the metadata table are always
> serial. However, async table services such as clustering can run in
> parallel with one or more writers and update the metadata table
> concurrently with the writer commits.
> In the multi-writer case, a lock provider is already configured, so
> metadata table updates are guarded against races. But lock providers are
> not mandatory today for single writer + async table service deployments,
> which leads to races in metadata table updates. An async table service
> such as clustering can race with the metadata table compaction and update
> the wrong delta log file instead of the correct next delta log file that
> follows the compaction.
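
A minimal sketch of the race described above. This is not Hudi code: the
class and method names (MetadataUpdateSketch, compact, appendDelta) and the
"slice-N.log.M" naming are hypothetical and exist only for illustration. It
assumes that resolving the current delta log file and appending to it must be
one step under a lock shared with compaction, which is roughly the guarantee
a lock provider is meant to give across processes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class MetadataUpdateSketch {
    // Hypothetical stand-in for a lock provider; guards all state below.
    private final ReentrantLock tableLock = new ReentrantLock();
    private int sliceVersion = 0;   // bumped by every compaction
    private int nextLogIndex = 1;   // next delta log index within the slice
    private final List<String> writtenLogs = new ArrayList<>();

    /** Compaction starts a new file slice, so delta logs restart at 1. */
    public void compact() {
        tableLock.lock();
        try {
            sliceVersion++;
            nextLogIndex = 1;
        } finally {
            tableLock.unlock();
        }
    }

    /**
     * A writer or an async table service appends one delta log. Resolving
     * the target and recording the write happen under the same lock, so a
     * concurrent compaction cannot slip in between the two.
     */
    public String appendDelta() {
        tableLock.lock();
        try {
            String target = "slice-" + sliceVersion + ".log." + nextLogIndex++;
            writtenLogs.add(target);
            return target;
        } finally {
            tableLock.unlock();
        }
    }

    /** Returns a snapshot of the delta logs written so far. */
    public List<String> writtenLogs() {
        tableLock.lock();
        try {
            return new ArrayList<>(writtenLogs);
        } finally {
            tableLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MetadataUpdateSketch table = new MetadataUpdateSketch();
        // Single writer: commits a delta, then compacts inline.
        Thread writer = new Thread(() -> {
            table.appendDelta();
            table.compact();
        });
        // Async table service (e.g. clustering) updating in parallel.
        Thread clustering = new Thread(table::appendDelta);
        writer.start();
        clustering.start();
        writer.join();
        clustering.join();
        // The interleaving varies between runs, but every recorded target
        // was current at the moment it was written.
        System.out.println(table.writtenLogs());
    }
}
```

Without the shared lock, the clustering thread could resolve its target
before the compaction and write after it, which is the wrong-delta-log
outcome the description refers to.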



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
