YuweiXiao commented on a change in pull request #4540:
URL: https://github.com/apache/hudi/pull/4540#discussion_r783929268



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -94,16 +94,16 @@
         HoodieTableMetaClient metaClient = partitionsToMetaClient.get(partitionPath);
         if (!fsCache.containsKey(metaClient)) {
           HoodieLocalEngineContext engineContext = new HoodieLocalEngineContext(conf);
-          HoodieTableFileSystemView fsView = FileSystemViewManager.createInMemoryFileSystemView(engineContext,
-              metaClient, HoodieInputFormatUtils.buildMetadataConfig(conf));
+          HoodieTableFileSystemView fsView = FileSystemViewManager.createInMemoryFileSystemViewWithTimeline(engineContext,
+              metaClient, HoodieInputFormatUtils.buildMetadataConfig(conf), metaClient.getActiveTimeline());
           fsCache.put(metaClient, fsView);
         }
         HoodieTableFileSystemView fsView = fsCache.get(metaClient);
 
         String relPartitionPath = FSUtils.getRelativePartitionPath(new Path(metaClient.getBasePath()), partitionPath);
         // Both commit and delta-commits are included - pick the latest completed one
         Option<HoodieInstant> latestCompletedInstant =
-            metaClient.getActiveTimeline().getCommitsTimeline().filterCompletedInstants().lastInstant();
+            metaClient.getActiveTimeline().getWriteTimeline().filterCompletedInstants().lastInstant();

Review comment:
       > I also don't understand the fix. Can you help shed some light? From
the description in this patch, the gap is that when a compaction is ongoing and
a new write comes in and completes, it may not be visible to queries. But the
fix here just includes compaction instants in the list of instants to process.
I'm not sure the description matches the fix. Or am I missing anything here?
   
   Hey! `fsView::getLatestMergedFileSlicesBeforeOrOn` checks whether a file
group is under compaction (i.e. its newest file slice is still under
construction) so that it can add the log files generated by concurrent writers
to the merged slice. That logic (see `fsView::fetchMergedFileSlice`) only works
if the timeline passed to the file system view includes compaction instants.
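   
   To make the point concrete, here is a toy Java model of the behaviour
described above. It is only a sketch and not Hudi code: `Instant`, `Slice`,
`filter`, `mergedLogFiles` and the sample data are hypothetical names invented
for illustration, and the visibility rule is a simplification. The assumption
is that a commits-only timeline covers commit, deltacommit and replacecommit
actions, while the timeline passed after this change also carries compaction
instants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TimelineSketch {

  /** Hypothetical stand-in for a timeline instant (not Hudi's HoodieInstant). */
  static final class Instant {
    final String time;
    final String action;
    final boolean completed;

    Instant(String time, String action, boolean completed) {
      this.time = time;
      this.action = action;
      this.completed = completed;
    }
  }

  /** Hypothetical stand-in for a file slice: a base instant plus its log files. */
  static final class Slice {
    final String baseInstant;
    final List<String> logFiles;

    Slice(String baseInstant, String... logFiles) {
      this.baseInstant = baseInstant;
      this.logFiles = Arrays.asList(logFiles);
    }
  }

  // Assumed action sets: the commits-only view has no compaction instants.
  static final Set<String> COMMIT_ACTIONS =
      new HashSet<>(Arrays.asList("commit", "deltacommit", "replacecommit"));

  /** Keeps only completed instants whose action is in the given set. */
  static List<Instant> filter(List<Instant> timeline, Set<String> actions) {
    List<Instant> out = new ArrayList<>();
    for (Instant i : timeline) {
      if (i.completed && actions.contains(i.action)) {
        out.add(i);
      }
    }
    return out;
  }

  /**
   * Simplified merge rule: a slice contributes its log files if its base
   * instant is completed in the view's timeline, or if the view knows that
   * the base instant is a pending compaction (slice under construction).
   */
  static List<String> mergedLogFiles(List<Slice> slices, List<Instant> viewTimeline) {
    List<String> logs = new ArrayList<>();
    for (Slice s : slices) {
      for (Instant i : viewTimeline) {
        if (!i.time.equals(s.baseInstant)) {
          continue;
        }
        boolean pendingCompaction = i.action.equals("compaction") && !i.completed;
        if (i.completed || pendingCompaction) {
          logs.addAll(s.logFiles);
        }
      }
    }
    return logs;
  }

  static List<Instant> sampleTimeline() {
    return Arrays.asList(
        new Instant("001", "commit", true),
        new Instant("002", "deltacommit", true),
        new Instant("003", "compaction", false),   // scheduled, still running
        new Instant("004", "deltacommit", true));  // concurrent write, completed
  }

  static List<Slice> sampleFileGroup() {
    return Arrays.asList(
        new Slice("001", "log.1"),   // written by deltacommit 002
        new Slice("003", "log.2"));  // written by 004 into the slice keyed by the compaction
  }

  public static void main(String[] args) {
    List<Instant> active = sampleTimeline();
    List<Slice> fileGroup = sampleFileGroup();

    // View built from completed commits only: the pending compaction is unknown.
    System.out.println(mergedLogFiles(fileGroup, filter(active, COMMIT_ACTIONS)));
    // View built from the full active timeline: the pending compaction is known.
    System.out.println(mergedLogFiles(fileGroup, active));
  }
}
```

   In this model the first call prints `[log.1]` and the second prints
`[log.1, log.2]`: the log file written by the concurrent writer only shows up
once the view's timeline contains the pending compaction instant.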
   



