manojpec commented on a change in pull request #4352: URL: https://github.com/apache/hudi/pull/4352#discussion_r780117989
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
##########
@@ -111,13 +124,14 @@ public HoodieBloomIndex(HoodieWriteConfig config, BaseHoodieBloomIndexHelper blo
   private HoodiePairData<HoodieKey, HoodieRecordLocation> lookupIndex(
       HoodiePairData<String, String> partitionRecordKeyPairs, final HoodieEngineContext context,
       final HoodieTable hoodieTable) {
-    // Obtain records per partition, in the incoming records
+    // Step 1: Obtain records per partition, in the incoming records
     Map<String, Long> recordsPerPartition = partitionRecordKeyPairs.countByKey();
     List<String> affectedPartitionPathList = new ArrayList<>(recordsPerPartition.keySet());

     // Step 2: Load all involved files as <Partition, filename> pairs
-    List<Pair<String, BloomIndexFileInfo>> fileInfoList =
-        loadInvolvedFiles(affectedPartitionPathList, context, hoodieTable);
+    List<Pair<String, BloomIndexFileInfo>> fileInfoList = (config.getMetadataConfig().isMetaIndexColumnStatsEnabled()

Review comment:
   Please take a look at the latest revision. The bloom index and column stats index keys now include the full file name instead of just the file ID part. The full file name is needed so that base file revisions are captured properly in the index.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
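
Illustrative note on the review comment: a minimal Java sketch of why keying an index entry by file ID alone can lose base file revisions, while keying by the full file name keeps them apart. Everything here is hypothetical and not taken from the PR: the class `IndexKeySketch`, the helper names, and the `partition|name` key format are invented for illustration, and the sketch assumes base file names follow a `<fileId>_<writeToken>_<instantTime>.parquet` layout. The actual key encoding used by the index in the PR may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexKeySketch {

    // Assumption: the file ID is everything before the first underscore
    // in a name shaped like <fileId>_<writeToken>_<instantTime>.parquet.
    static String fileIdOf(String baseFileName) {
        int idx = baseFileName.indexOf('_');
        return idx < 0 ? baseFileName : baseFileName.substring(0, idx);
    }

    // Hypothetical key: partition plus file ID only.
    static String keyByFileId(String partition, String baseFileName) {
        return partition + "|" + fileIdOf(baseFileName);
    }

    // Hypothetical key: partition plus the full base file name.
    static String keyByFileName(String partition, String baseFileName) {
        return partition + "|" + baseFileName;
    }

    // Inserts one entry per file and returns how many distinct keys survive.
    static int distinctKeys(String partition, List<String> files, boolean useFullName) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String file : files) {
            String key = useFullName
                ? keyByFileName(partition, file)
                : keyByFileId(partition, file);
            index.put(key, file); // a later revision overwrites an equal key
        }
        return index.size();
    }

    public static void main(String[] args) {
        // Two revisions of the same file group, written at different instants.
        List<String> revisions = Arrays.asList(
            "f1a2_1-0-1_20220101000000.parquet",
            "f1a2_1-0-1_20220102000000.parquet");

        System.out.println(distinctKeys("2022/01/01", revisions, false)); // prints 1
        System.out.println(distinctKeys("2022/01/01", revisions, true));  // prints 2
    }
}
```

With file-ID keys, both revisions collapse into a single entry, so the index cannot tell which base file version its stats or bloom filter describe. With full-file-name keys, each revision gets its own entry.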