codope commented on a change in pull request #4693:
URL: https://github.com/apache/hudi/pull/4693#discussion_r835754540
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -621,8 +635,14 @@ private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, Metadata
LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+ HoodieTableFileSystemView fsView = HoodieTableMetadataUtil.getFileSystemView(metadataMetaClient);
+ List<FileSlice> fileSlices = HoodieTableMetadataUtil.getPartitionLatestFileSlices(metadataMetaClient, Option.ofNullable(fsView), metadataPartition.getPartitionPath());
for (int i = 0; i < fileGroupCount; ++i) {
final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+ // if a writer or async indexer had already initialized the filegroup then continue
+ if (!fileSlices.isEmpty() && fileSlices.stream().anyMatch(fileSlice -> fileGroupFileId.equals(fileSlice.getFileGroupId().getFileId()))) {
Review comment:
Since file groups are now initialized when the index is scheduled, we should no longer hit this case.
> Ideally initializeFileGroups should be called just once per MDT partition, right? Or am I missing something?
That's correct.
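For reference, the guard in the hunk above amounts to an idempotent "skip if already created" check. A minimal standalone sketch of that idea follows. It is not the Hudi implementation: the class `FileGroupInitSketch`, the method `fileIdsToCreate`, and the use of a plain `Set<String>` of existing file IDs in place of `List<FileSlice>` are all assumptions made for illustration, as is the `files-` prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for Hudi's FileSlice / file group IDs.
class FileGroupInitSketch {

  // Returns the file IDs that still need to be created, skipping any
  // that a writer or async indexer has already initialized.
  static List<String> fileIdsToCreate(String fileIdPrefix, int fileGroupCount,
                                      Set<String> existingFileIds) {
    List<String> toCreate = new ArrayList<>();
    for (int i = 0; i < fileGroupCount; ++i) {
      // Same ID scheme as in the hunk: prefix plus a zero-padded 4-digit index.
      String fileGroupFileId = String.format("%s%04d", fileIdPrefix, i);
      if (existingFileIds.contains(fileGroupFileId)) {
        continue; // already initialized, nothing to do
      }
      toCreate.add(fileGroupFileId);
    }
    return toCreate;
  }

  public static void main(String[] args) {
    // Assumed example: file group 0 already exists, 1 and 2 do not.
    Set<String> existing = new HashSet<>(Arrays.asList("files-0000"));
    System.out.println(fileIdsToCreate("files-", 3, existing));
    // prints [files-0001, files-0002]
  }
}
```

If initializeFileGroups really does run only once per MDT partition, the set of existing IDs is always empty at that point and the check never skips anything, which is why it should no longer be needed.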