[ https://issues.apache.org/jira/browse/HUDI-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Sumit updated HUDI-2488:
------------------------------
    Issue Type: Epic  (was: New Feature)

Support async metadata index creation while regular writers and table services are in progress
-----------------------------------------------------------------------------------------------

                Key: HUDI-2488
                URL: https://issues.apache.org/jira/browse/HUDI-2488
            Project: Apache Hudi
         Issue Type: Epic
           Reporter: sivabalan narayanan
           Assignee: Sagar Sumit
           Priority: Blocker
             Labels: pull-request-available
            Fix For: 0.11.0
        Attachments: image-2021-11-17-11-04-09-713.png

For now, we have only the FILES partition in the metadata table, and our suggestion is to stop all processes and then restart them one by one with the metadata table enabled. The first process to start back up will trigger bootstrapping of the metadata table.

But this may not work out well as we add more and more partitions to the metadata table. We need to support bootstrapping one or more partitions of the metadata table while regular writers and table services are in progress.

Penning down my thoughts/idea: I tried to find a way to get this done without adding an additional lock, but could not crack that. So here is one way to support async bootstrap.

Introduce a file called "available_partitions" in some special location under the metadata table. This file will contain the list of partitions that are ready to apply updates from the data table. That is, when we do synchronous updates from the data table to the metadata table and the metadata table has N partitions, we need to know which partitions are fully bootstrapped and ready to take updates; this file maintains that info. We can debate how to maintain this info (table props, a separate file, etc.), but for now let's say this file is the source of truth. The idea is that any async bootstrap process updates this file with the newly bootstrapped partition once its bootstrap is fully complete, so that all other writers know which partitions to update. And we need to introduce a metadata_lock as well.

Here is how writers and async bootstrap will pan out.

Regular writer or any async table service (compaction, etc.), when changes need to be applied to the metadata table (FYI, as of today this already happens within the data table lock):
    Take the metadata_lock.
    Read the contents of available_partitions.
    Prep records and apply updates to the metadata table.
    Release the lock.

Async bootstrap process:
    Start bootstrapping a given partition (e.g. files) in the metadata table.
    Do it in a loop: the first iteration of the bootstrap could take 10 mins, for example; then catch up the new commits that happened during those 10 mins, which could take 1 min; then go for another round. Whenever the total bootstrap time for a round is ~1 min or less, the next round can be the final iteration.
    During the final iteration, take the metadata_lock (this lock should not be held for more than a few secs):
        Apply any new commits that happened while the last iteration of the bootstrap was running.
        Update the "available_partitions" file with the partition that was fully bootstrapped.
        Release the lock.

metadata_lock: this ensures that when async bootstrap is in the final stage of bootstrapping, we do not miss any commits that are nearing completion, so we have to take a lock. Either async bootstrap will apply the update, or the actual writer itself will update the partition directly once bootstrap is fully complete. A sketch of both code paths follows.
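To make the handshake concrete, here is a minimal Java sketch of the two code paths, assuming hypothetical AvailablePartitionsFile and MetadataTable abstractions (these are not existing Hudi classes) and an in-process ReentrantLock standing in for metadata_lock, which in practice would be a cross-process lock provider:

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: AvailablePartitionsFile and MetadataTable are placeholder
// abstractions, and ReentrantLock stands in for the cross-process metadata_lock.
public class AsyncBootstrapSketch {

  interface AvailablePartitionsFile {
    List<String> read();         // partitions that are fully bootstrapped
    void add(String partition);  // publish a newly bootstrapped partition
  }

  interface MetadataTable {
    String fullBootstrap(String partition);               // long first pass, returns last synced instant
    String catchUp(String partition, String fromInstant); // replay commits since fromInstant, returns new instant
    void applyUpdates(String commitInstant, List<String> readyPartitions);
  }

  private final ReentrantLock metadataLock = new ReentrantLock();
  private final AvailablePartitionsFile availablePartitions;
  private final MetadataTable metadataTable;

  AsyncBootstrapSketch(AvailablePartitionsFile availablePartitions, MetadataTable metadataTable) {
    this.availablePartitions = availablePartitions;
    this.metadataTable = metadataTable;
  }

  /** Regular writer or async table service: apply one commit's changes to the metadata table. */
  void applyCommitToMetadata(String commitInstant) {
    metadataLock.lock();
    try {
      List<String> readyPartitions = availablePartitions.read(); // only fully bootstrapped partitions
      metadataTable.applyUpdates(commitInstant, readyPartitions);
    } finally {
      metadataLock.unlock();
    }
  }

  /** Async bootstrap of one metadata partition, e.g. "files". */
  void bootstrapPartition(String partition) {
    String lastSynced = metadataTable.fullBootstrap(partition); // e.g. ~10 mins
    while (true) {
      long start = System.currentTimeMillis();
      lastSynced = metadataTable.catchUp(partition, lastSynced); // catch up commits from the last round
      if (System.currentTimeMillis() - start <= 60_000) {
        break; // a round finished within ~1 min, so the next (locked) iteration is the final one
      }
    }
    metadataLock.lock(); // final iteration; held only for a few seconds
    try {
      metadataTable.catchUp(partition, lastSynced); // commits that landed during the last round
      availablePartitions.add(partition);           // publish the partition as ready for direct updates
    } finally {
      metadataLock.unlock();
    }
  }
}
{code}

The only invariant this sketch relies on is that available_partitions is read and written exclusively under the metadata_lock.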
Regarding "available_partitions": I was looking for a way to know which partitions are fully ready to take direct updates from regular writers, and hence chose this approach. We could also think about creating a temp partition (files_temp or something) while the bootstrap is in progress and then renaming it to the original partition name once the bootstrap is fully complete. If we can reliably rename these partitions (i.e., once the files partition is visible, it is fully ready to take direct updates), we can take this route as well. Here is how it might pan out with folder/partition renaming (see the sketch after the failure notes below).

Regular writer or any async table service (compaction, etc.), when changes need to be applied to the metadata table (FYI, as of today this already happens within the data table lock):
    Take the metadata_lock.
    List the partitions in the metadata table, ignoring temp partitions.
    Prep records and apply updates to the metadata table.
    Release the lock.

Async bootstrap process:
    Start bootstrapping a given partition (e.g. files) in the metadata table.
    Create a temp folder for the partition that is getting bootstrapped (e.g. files_temp).
    Do it in a loop: the first iteration of the bootstrap could take 10 mins, for example; then catch up the new commits that happened during those 10 mins, which could take 1 min; then go for another round. Whenever the total bootstrap time for a round is ~1 min or less, the next round can be the final iteration.
    During the final iteration, take the metadata_lock (this lock should not be held for more than a few secs):
        Apply any new commits that happened while the last iteration of the bootstrap was running.
        Rename files_temp to files.
        Release the lock.
Note: we just need to ensure that the folder rename is consistent. On a crash, either the new folder is fully intact or it does not exist at all; the contents of the old folder do not matter.

Failures:
a. If the bootstrap fails midway, then as long as "files" has not been created yet, we can delete files_temp and start all over again.
b. If the bootstrap fails just after the rename, we should again be fine; it is just that the lock may not have been released, and we need to ensure the metadata_lock gets released. To tackle this, if acquiring the metadata_lock from a regular writer fails, we just proceed with listing the partitions and applying updates.
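In the same spirit, here is a hedged sketch of the rename-based variant, assuming a plain java.nio.file layout purely for illustration; names such as TEMP_SUFFIX, readyPartitionsForUpdate, and finalizePartition are invented for the example, ReentrantLock again stands in for a cross-process metadata_lock, and the atomic move only holds on filesystems that support it (which is exactly the "consistent rename" assumption above):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only: paths, TEMP_SUFFIX, and method names are hypothetical,
// and ReentrantLock stands in for the cross-process metadata_lock.
public class RenameBootstrapSketch {

  private static final String TEMP_SUFFIX = "_temp";
  private final ReentrantLock metadataLock = new ReentrantLock();
  private final Path metadataBasePath;

  RenameBootstrapSketch(Path metadataBasePath) {
    this.metadataBasePath = metadataBasePath;
  }

  /** Writer path: list partitions, skip *_temp folders, and fall back if the lock cannot be taken. */
  List<Path> readyPartitionsForUpdate() throws IOException, InterruptedException {
    // Failure case (b): if the lock is stuck after a crashed bootstrap, proceed without it.
    boolean locked = metadataLock.tryLock(5, TimeUnit.SECONDS);
    try (Stream<Path> children = Files.list(metadataBasePath)) {
      return children
          .filter(Files::isDirectory)
          .filter(p -> !p.getFileName().toString().endsWith(TEMP_SUFFIX)) // ignore in-progress partitions
          .collect(Collectors.toList());
    } finally {
      if (locked) {
        metadataLock.unlock();
      }
    }
  }

  /** Final bootstrap step: apply the remaining commits, then promote files_temp to files under the lock. */
  void finalizePartition(String partition, Runnable applyRemainingCommits) throws IOException {
    Path tempDir = metadataBasePath.resolve(partition + TEMP_SUFFIX);
    Path finalDir = metadataBasePath.resolve(partition);
    metadataLock.lock(); // held only for a few seconds
    try {
      applyRemainingCommits.run();                                    // commits that landed during the last round
      Files.move(tempDir, finalDir, StandardCopyOption.ATOMIC_MOVE);  // folder appears fully or not at all
    } finally {
      metadataLock.unlock();
    }
  }
}
{code}

The tryLock fallback in readyPartitionsForUpdate mirrors failure case (b) above: a writer that cannot obtain the metadata_lock still makes progress by listing only the non-temp partitions, which are by construction fully bootstrapped.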