codope commented on a change in pull request #4693:
URL: https://github.com/apache/hudi/pull/4693#discussion_r835754540



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -621,8 +635,14 @@ private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, Metadata
 
     LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
         fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+    HoodieTableFileSystemView fsView = HoodieTableMetadataUtil.getFileSystemView(metadataMetaClient);
+    List<FileSlice> fileSlices = HoodieTableMetadataUtil.getPartitionLatestFileSlices(metadataMetaClient, Option.ofNullable(fsView), metadataPartition.getPartitionPath());
     for (int i = 0; i < fileGroupCount; ++i) {
       final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+      // if a writer or async indexer had already initialized the filegroup then continue
+      if (!fileSlices.isEmpty() && fileSlices.stream().anyMatch(fileSlice -> fileGroupFileId.equals(fileSlice.getFileGroupId().getFileId()))) {

Review comment:
       With initialization now happening while scheduling the index, we should no longer hit this case.
   > Ideally initializeFileGroups should be called just once per MDT partition, right? Or am I missing something?
   
   That's correct.
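
For reference, the skip-if-already-initialized guard in the hunk above can be sketched in isolation. This is only an illustration, not the code in this PR: `FileGroupInitSketch`, `missingFileGroupIds`, and the `existingFileIds` set are hypothetical names, and the set stands in for the file IDs that would be read from the latest file slices. Only the `"%s%04d"` file ID format is taken from the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the guard in initializeFileGroups; not a Hudi class.
final class FileGroupInitSketch {

  // Returns the file group IDs that still need to be created, skipping any
  // ID that is already present in existingFileIds (assumed to be collected
  // once from the partition's latest file slices).
  static List<String> missingFileGroupIds(String fileIdPrefix, int fileGroupCount, Set<String> existingFileIds) {
    List<String> missing = new ArrayList<>();
    for (int i = 0; i < fileGroupCount; ++i) {
      // Same ID format as in the diff: prefix plus zero-padded 4-digit index.
      String fileGroupFileId = String.format("%s%04d", fileIdPrefix, i);
      if (existingFileIds.contains(fileGroupFileId)) {
        // Already initialized by a writer or the async indexer: skip it.
        continue;
      }
      missing.add(fileGroupFileId);
    }
    return missing;
  }
}
```

If initialization really does run only once per MDT partition, `existingFileIds` would always be empty and every ID would be returned, so the guard would only act as a safety net.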





