bvaradar commented on a change in pull request #2048:
URL: https://github.com/apache/hudi/pull/2048#discussion_r490589908



##########
File path: hudi-client/src/main/java/org/apache/hudi/table/HoodieTimelineArchiveLog.java
##########
@@ -301,6 +304,61 @@ private void deleteAnyLeftOverMarkerFiles(JavaSparkContext jsc, HoodieInstant in
     }
   }
 
+  private void deleteReplacedFiles(HoodieInstant instant) {
+    if (!instant.isCompleted()) {
+      // only delete files for completed instants
+      return;
+    }
+
+    TableFileSystemView fileSystemView = this.table.getFileSystemView();
+    ensureReplacedPartitionsLoadedCorrectly(instant, fileSystemView);
+
+    Stream<HoodieFileGroup> fileGroupsToDelete = fileSystemView

Review comment:
   @satishkotha Here is the plan we discussed:
   1. Change the signature of fileSystemView.getReplacedFileGroupsBeforeOrOn to also take a partitionId.
   2. In HoodieTimelineArchiveLog.deleteReplacedFileGroups, read the replace metadata (which we already do) and, for each partition, call fileSystemView.getReplacedFileGroupsBeforeOrOn().
   3. Make the per-partition calls in step 2 run in parallel.

   This should let the file-system view retain its lazy-loading semantics, since only the partitions named in the replace metadata are ever requested.
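
   A rough, self-contained sketch of the three steps above, for illustration only. Everything in it is an assumption rather than Hudi's actual code: `ReplacedFileGroupView`, `LazyView`, and `collectReplacedFileGroups` are hypothetical names, file groups are reduced to plain string IDs, the partition argument is modelled as a partition path string, and a Java parallel stream stands in for whatever parallelism the real change would use (for example the `JavaSparkContext`).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ReplacedFileGroupCleanupSketch {

  /** Hypothetical stand-in for the file-system view; not Hudi's actual interface. */
  interface ReplacedFileGroupView {
    // Assumed signature after step 1: the lookup is scoped to one partition.
    Stream<String> getReplacedFileGroupsBeforeOrOn(String maxInstantTime, String partitionPath);
  }

  /** Toy view that touches a partition only when asked and records which ones it touched. */
  static class LazyView implements ReplacedFileGroupView {
    // partition path -> (file group id -> instant time at which it was replaced)
    private final Map<String, Map<String, String>> replaced;
    final Set<String> loadedPartitions = ConcurrentHashMap.newKeySet();

    LazyView(Map<String, Map<String, String>> replaced) {
      this.replaced = replaced;
    }

    @Override
    public Stream<String> getReplacedFileGroupsBeforeOrOn(String maxInstantTime, String partitionPath) {
      loadedPartitions.add(partitionPath); // stands in for the lazy per-partition load
      return replaced.getOrDefault(partitionPath, Collections.<String, String>emptyMap())
          .entrySet().stream()
          // instant times are assumed to be fixed-width strings that sort chronologically
          .filter(e -> e.getValue().compareTo(maxInstantTime) <= 0)
          .map(Map.Entry::getKey);
    }
  }

  /**
   * Steps 2 and 3: given the partitions named in the replace metadata,
   * query the view once per partition, in parallel, and merge the results.
   */
  static List<String> collectReplacedFileGroups(
      ReplacedFileGroupView view, String instantTime, Collection<String> partitions) {
    return partitions.parallelStream()
        .flatMap(p -> view.getReplacedFileGroupsBeforeOrOn(instantTime, p))
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> data = new HashMap<>();
    Map<String, String> day1 = new HashMap<>();
    day1.put("fg-1", "001");
    day1.put("fg-2", "005"); // replaced after the instant being archived
    data.put("2020/09/01", day1);
    data.put("2020/09/02", Collections.singletonMap("fg-3", "002"));
    data.put("2020/09/03", Collections.singletonMap("fg-4", "001"));

    LazyView view = new LazyView(data);
    // Pretend the replace metadata for instant "003" names only the first two partitions.
    List<String> toDelete =
        collectReplacedFileGroups(view, "003", Arrays.asList("2020/09/01", "2020/09/02"));

    System.out.println("file groups to delete: " + toDelete);
    System.out.println("partitions loaded: " + new TreeSet<>(view.loadedPartitions));
  }
}
```

   In this toy run the third partition is never requested, which is the lazy-loading property the plan is trying to preserve.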




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

