bvaradar commented on a change in pull request #2048:
URL: https://github.com/apache/hudi/pull/2048#discussion_r490589908
##########
File path:
hudi-client/src/main/java/org/apache/hudi/table/HoodieTimelineArchiveLog.java
##########
@@ -301,6 +304,61 @@ private void deleteAnyLeftOverMarkerFiles(JavaSparkContext jsc, HoodieInstant in
}
}
+ private void deleteReplacedFiles(HoodieInstant instant) {
+ if (!instant.isCompleted()) {
+ // only delete files for completed instants
+ return;
+ }
+
+ TableFileSystemView fileSystemView = this.table.getFileSystemView();
+ ensureReplacedPartitionsLoadedCorrectly(instant, fileSystemView);
+
+ Stream<HoodieFileGroup> fileGroupsToDelete = fileSystemView
Review comment:
@satishkotha Here is the plan we discussed:
1. Change the signature of fileSystemView.getReplacedFileGroupsBeforeOrOn to
also take a partitionId.
2. In HoodieTimelineArchiveLog.deleteReplacedFileGroups, read the replace
metadata (which we already do) and, for each partition, call
fileSystemView.getReplacedFileGroupsBeforeOrOn().
3. Do (2) in such a way that the calls to
fileSystemView.getReplacedFileGroupsBeforeOrOn run in parallel.
This should let the file-system view retain its lazy-loading semantics.
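
A minimal sketch of steps 1-3, to make the shape of the change concrete. Everything here is a hypothetical stand-in rather than Hudi's actual code: `PartitionedView` mimics only the proposed two-argument signature, file groups are represented as plain strings, and a Java parallel stream stands in for whatever parallelism the archiver would really use (for example, the Spark context).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; these are NOT Hudi's real classes or signatures.
class ReplacedFileGroupsSketch {

  /** Stand-in for the file-system view with the proposed signature (step 1). */
  interface PartitionedView {
    // Assumption: the extra partitionId argument lets the view load
    // only that partition instead of every replaced partition up front.
    Stream<String> getReplacedFileGroupsBeforeOrOn(String maxInstantTime, String partitionId);
  }

  /**
   * Steps 2 and 3: given the partitions read from the replace metadata,
   * query the view once per partition, with the partitions processed in parallel.
   */
  static List<String> collectReplacedFileGroups(
      PartitionedView view, String instantTime, List<String> partitions) {
    return partitions.parallelStream()
        .flatMap(partition -> view.getReplacedFileGroupsBeforeOrOn(instantTime, partition))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Fake view: pretends each partition has two replaced file groups.
    PartitionedView view = (instantTime, partition) ->
        Stream.of(partition + "/fg-1", partition + "/fg-2");

    List<String> toDelete = collectReplacedFileGroups(
        view, "003", Arrays.asList("2020/09/01", "2020/09/02"));
    toDelete.forEach(System.out::println);
  }
}
```

The point of the sketch is that the caller, not the view, decides which partitions are touched, so the view never has to eagerly load partitions that the replace metadata does not mention.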