yihua commented on code in PR #13647:
URL: https://github.com/apache/hudi/pull/13647#discussion_r2248416662


##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -366,39 +359,43 @@ Map<String, List<StoragePathInfo>> fetchAllFilesInPartitionPaths(List<StoragePat
   }
 
   /**
-   * Computes a map from col-stats key to partition and file name pair.
+   * Computes raw keys and metadata for column stats lookup.
    *
    * @param partitionNameFileNameList - List of partition and file name pair for which bloom filters need to be retrieved.
    * @param columnNames - List of column name for which stats are needed.
+   * @return Pair of raw keys list and a map from encoded key to partition/file pair
    */
-  private Map<String, Pair<String, String>> computeColStatKeyToFileName(
+  private Pair<List<ColumnStatsIndexRawKey>, Map<String, Pair<String, String>>> computeColStatRawKeys(

Review Comment:
   Why is the API changed to return raw keys as well?



##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -261,16 +251,16 @@ public Map<Pair<String, String>, List<HoodieMetadataColumnStats>> getColumnStats
       return Collections.emptyMap();
     }
 
-    Map<String, Pair<String, String>> columnStatKeyToFileNameMap = computeColStatKeyToFileName(partitionNameFileNameList, columnNames);
-    return computeFileToColumnStatsMap(columnStatKeyToFileNameMap);
+    Pair<List<ColumnStatsIndexRawKey>, Map<String, Pair<String, String>>> rawKeysAndMap = computeColStatRawKeys(partitionNameFileNameList, columnNames);
+    return computeFileToColumnStatsMap(rawKeysAndMap.getLeft(), rawKeysAndMap.getRight());
   }
 
   /**
    * Returns a list of all partitions.
    */
   protected List<String> fetchAllPartitionPaths() {
     HoodieTimer timer = HoodieTimer.start();
-    Option<HoodieRecord<HoodieMetadataPayload>> recordOpt = getRecordByKey(RECORDKEY_PARTITION_LIST,
+    Option<HoodieRecord<HoodieMetadataPayload>> recordOpt = readFilesIndexRecords(RECORDKEY_PARTITION_LIST,

Review Comment:
   If the method only reads records from the FILES partition, there is no need to pass in `MetadataPartitionType.FILES.getPartitionPath()`.
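
   A minimal sketch of this suggestion, using hypothetical stand-in names rather than the actual Hudi classes: `FilesIndexReaderSketch`, `readRecord`, `readFilesIndexRecord`, and the `FILES_PARTITION_PATH` constant are all assumptions made for illustration. The idea is that a FILES-specific reader can fix the partition internally, so callers pass only the record key.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified stand-in for a metadata reader; not the real Hudi API.
class FilesIndexReaderSketch {
  // Assumed placeholder for the value of MetadataPartitionType.FILES.getPartitionPath().
  public static final String FILES_PARTITION_PATH = "files";

  // In-memory store: partition path -> (record key -> record value).
  private final Map<String, Map<String, String>> store = new HashMap<>();

  public void put(String partitionPath, String key, String value) {
    store.computeIfAbsent(partitionPath, p -> new HashMap<>()).put(key, value);
  }

  // Generic lookup: the caller must say which metadata partition to read.
  public Optional<String> readRecord(String key, String partitionPath) {
    Map<String, String> records =
        store.getOrDefault(partitionPath, Collections.<String, String>emptyMap());
    return Optional.ofNullable(records.get(key));
  }

  // FILES-specific lookup: the partition is implied by the method name,
  // so it is not part of the signature.
  public Optional<String> readFilesIndexRecord(String key) {
    return readRecord(key, FILES_PARTITION_PATH);
  }
}
```

   With this shape, a call site would read `reader.readFilesIndexRecord(someKey)` with no partition argument, and the FILES partition path is referenced in one place only.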



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
