nsivabalan commented on code in PR #13295:
URL: https://github.com/apache/hudi/pull/13295#discussion_r2112563984
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1481,18 +1487,21 @@ protected HoodieData<HoodieRecord>
prepRecords(Map<String, HoodieData<HoodieReco
ValidationUtils.checkArgument(fileGroupCount > 0,
String.format("FileGroup count for MDT partition %s should be > 0",
partitionName));
List<FileSlice> finalFileSlices = fileSlices;
+ Set<String> mappedFileIds = new HashSet<>();
HoodieData<HoodieRecord> rddSinglePartitionRecords = records.map(r -> {
FileSlice slice =
finalFileSlices.get(HoodieTableMetadataUtil.mapRecordKeyToFileGroupIndex(r.getRecordKey(),
fileGroupCount));
r.unseal();
r.setCurrentLocation(new
HoodieRecordLocation(slice.getBaseInstantTime(), slice.getFileId()));
r.seal();
+ mappedFileIds.add(slice.getFileId());
Review Comment:
Good catch. I fixed this in my later patch but forgot to update it here.
You can check it out at:
https://github.com/apache/hudi/pull/13312/files#r2112561019
There is a known limitation to the fix. While writing, we will spin up a Spark
task for every file group in the MDT partitions we touch, because without
triggering an action on the records we cannot know which file groups actually
have records to write.
Since we do not want to trigger any action, we return every file group in the
partitions we touch from here.
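To illustrate the trade-off, here is a minimal sketch in plain Java, not Hudi code. It uses `java.util.stream.Stream` as a stand-in for the lazy `HoodieData.map`, and the class name, `fileGroupIndex`, and the sample file IDs are all hypothetical. A set populated inside a lazy `map` stays empty until something consumes the pipeline, which is why the fix returns every file group up front. On Spark there is an additional problem, since the closure runs on executors and would not update a driver-side set even after an action.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

public class LazyMapSketch {

  // Hypothetical stand-in for mapRecordKeyToFileGroupIndex; NOT Hudi's actual hashing.
  public static int fileGroupIndex(String recordKey, int fileGroupCount) {
    return Math.floorMod(recordKey.hashCode(), fileGroupCount);
  }

  // Side-effect approach: the set is filled only when the lazy pipeline is consumed.
  // No terminal operation runs here, so the returned snapshot is empty.
  public static Set<String> idsSeenBeforeAction(List<String> keys, List<String> fileIds) {
    Set<String> mapped = new HashSet<>();
    Stream<String> tagged = keys.stream().map(k -> {
      String fileId = fileIds.get(fileGroupIndex(k, fileIds.size()));
      mapped.add(fileId); // never executed: nothing consumes 'tagged'
      return k + "@" + fileId;
    });
    return new HashSet<>(mapped);
  }

  // Approach described in the comment: report every file group in the partition,
  // accepting that some of them may end up with no records to write.
  public static Set<String> allFileIds(List<String> fileIds) {
    return new HashSet<>(fileIds);
  }

  public static void main(String[] args) {
    List<String> fileIds = Arrays.asList("fg-0", "fg-1", "fg-2", "fg-3");
    List<String> keys = Arrays.asList("key-1", "key-2");
    System.out.println("tracked inside lazy map: " + idsSeenBeforeAction(keys, fileIds));
    System.out.println("returned without an action: " + allFileIds(fileIds));
  }
}
```

The cost of the second approach is the one noted above: a write task is scheduled for each returned file group, including those that receive no records.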
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1481,18 +1487,21 @@ protected HoodieData<HoodieRecord>
prepRecords(Map<String, HoodieData<HoodieReco
ValidationUtils.checkArgument(fileGroupCount > 0,
String.format("FileGroup count for MDT partition %s should be > 0",
partitionName));
List<FileSlice> finalFileSlices = fileSlices;
+ Set<String> mappedFileIds = new HashSet<>();
HoodieData<HoodieRecord> rddSinglePartitionRecords = records.map(r -> {
FileSlice slice =
finalFileSlices.get(HoodieTableMetadataUtil.mapRecordKeyToFileGroupIndex(r.getRecordKey(),
fileGroupCount));
r.unseal();
r.setCurrentLocation(new
HoodieRecordLocation(slice.getBaseInstantTime(), slice.getFileId()));
r.seal();
+ mappedFileIds.add(slice.getFileId());
Review Comment:
Will fix this patch accordingly.