codope commented on a change in pull request #4693:
URL: https://github.com/apache/hudi/pull/4693#discussion_r835762079



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -645,12 +669,36 @@ private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, Metadata
     }
   }
 
+  public void dropIndex(List<MetadataPartitionType> indexesToDrop) throws IOException {
+    Set<String> completedIndexes = Stream.of(dataMetaClient.getTableConfig().getCompletedMetadataIndexes().split(","))
+        .map(String::trim).filter(s -> !s.isEmpty()).collect(Collectors.toSet());
+    Set<String> inflightIndexes = Stream.of(dataMetaClient.getTableConfig().getInflightMetadataIndexes().split(","))
+        .map(String::trim).filter(s -> !s.isEmpty()).collect(Collectors.toSet());
+    for (MetadataPartitionType partitionType : indexesToDrop) {
+      String partitionPath = partitionType.getPartitionPath();
+      if (inflightIndexes.contains(partitionPath)) {
+        LOG.error("Metadata indexing in progress: " + partitionPath);
+        return;
+      }
+      LOG.warn("Deleting Metadata Table partitions: " + partitionPath);
+      dataMetaClient.getFs().delete(new Path(metadataWriteConfig.getBasePath(), partitionPath), true);
+      completedIndexes.remove(partitionPath);
+    }
+    // update table config
+    dataMetaClient.getTableConfig().setValue(HoodieTableConfig.TABLE_METADATA_INDEX_COMPLETED.key(), String.join(",", completedIndexes));

Review comment:
   > should we not first update the table config and then delete the partitions
   
   Yes, good catch! I had fixed this, but the change may have been lost while
   rebasing.
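
To make the intended ordering concrete, here is a minimal sketch, not the actual patch. It assumes a plain `Properties` object standing in for the table config and a local directory standing in for the metadata table base path; the class name, the two config keys, and the helper methods are hypothetical and exist only for illustration. All in-flight checks run first, then the config is updated, and only then are the partitions deleted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative stand-in only; none of these names are Hudi APIs. */
final class DropIndexOrderingSketch {
  // Hypothetical config keys, chosen for this sketch.
  static final String COMPLETED_KEY = "sketch.metadata.indexes.completed";
  static final String INFLIGHT_KEY = "sketch.metadata.indexes.inflight";

  /** Splits a comma-separated list, trimming entries and dropping empty ones. */
  static Set<String> parseIndexes(String csv) {
    return Stream.of(csv.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toCollection(TreeSet::new));
  }

  static void dropIndexes(Properties tableConfig, Path metadataBasePath,
                          List<String> partitionsToDrop) throws IOException {
    Set<String> inflight = parseIndexes(tableConfig.getProperty(INFLIGHT_KEY, ""));
    Set<String> completed = parseIndexes(tableConfig.getProperty(COMPLETED_KEY, ""));

    // Validate everything up front so a request is never applied halfway.
    for (String partition : partitionsToDrop) {
      if (inflight.contains(partition)) {
        throw new IllegalStateException("Metadata indexing in progress: " + partition);
      }
    }

    // Step 1: update the config so that anyone reloading it stops using the index.
    completed.removeAll(partitionsToDrop);
    tableConfig.setProperty(COMPLETED_KEY, String.join(",", completed));

    // Step 2: only now remove the partition data.
    for (String partition : partitionsToDrop) {
      deleteRecursively(metadataBasePath.resolve(partition));
    }
  }

  /** Deletes a directory tree, children before parents; a missing path is a no-op. */
  static void deleteRecursively(Path root) throws IOException {
    if (!Files.exists(root)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path path : paths) {
      Files.delete(path);
    }
  }
}
```

With this ordering, a failure between the two steps leaves orphaned partition files rather than a config that still advertises an index whose data is already gone. It does not help a writer that cached the old config in memory before step 1.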
   
   > Other writes who are holding on to an in memory table property are not
   > going to get an updated value if we update here.
   
   Your idea is good, but waiting for a minute only reduces the probability of
   failure; it does not eliminate it. Also note that the index is dropped
   within a lock, and I don't think dropping an index is something users will
   do very frequently.
   
   To support fully concurrent writes, we could do something similar to MySQL,
   which, as far as I know, drops an index lazily: it simply marks the current
   index as deleted and physically deletes it later, once no other writer is
   referencing it.
   Tracking here: https://issues.apache.org/jira/browse/HUDI-3718
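
A rough sketch of the lazy-drop idea, under heavy assumptions: everything below is hypothetical (class and method names included) and tracks references in memory within a single process. Hudi writers are separate processes, so a real design would need to persist the "marked deleted" state and detect active writers some other way; this only illustrates the mark-then-purge protocol.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: in-memory model of "mark deleted now, purge later". */
final class LazyIndexDropSketch {
  private final Set<String> physicallyPresent = new HashSet<>();
  private final Set<String> markedDeleted = new HashSet<>();
  private final Map<String, Integer> activeWriters = new HashMap<>();

  LazyIndexDropSketch(Collection<String> indexes) {
    physicallyPresent.addAll(indexes);
  }

  /** A writer registers its use of an index; refused once the index is marked deleted. */
  synchronized boolean acquire(String index) {
    if (!physicallyPresent.contains(index) || markedDeleted.contains(index)) {
      return false;
    }
    activeWriters.merge(index, 1, Integer::sum);
    return true;
  }

  /** A writer is done; the last one out purges an index that was marked deleted. */
  synchronized void release(String index) {
    Integer remaining = activeWriters.computeIfPresent(index, (k, v) -> v > 1 ? v - 1 : null);
    if (remaining == null && markedDeleted.contains(index)) {
      purge(index);
    }
  }

  /** Logical drop: mark immediately, purge right away only if nobody is using it. */
  synchronized void drop(String index) {
    if (!physicallyPresent.contains(index)) {
      return;
    }
    markedDeleted.add(index);
    if (!activeWriters.containsKey(index)) {
      purge(index);
    }
  }

  synchronized boolean isPhysicallyPresent(String index) {
    return physicallyPresent.contains(index);
  }

  private void purge(String index) {
    // A real implementation would delete the partition files here.
    physicallyPresent.remove(index);
    markedDeleted.remove(index);
  }
}
```

In this model `drop` returns immediately even when writers are active: new writers are turned away by `acquire`, and the physical delete happens when the last existing writer calls `release`.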




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

