zhangyue19921010 commented on code in PR #13060: URL: https://github.com/apache/hudi/pull/13060#discussion_r2025037864
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/prune/PrimaryKeyPruners.java
##########

```diff
@@ -45,7 +45,7 @@ public class PrimaryKeyPruners {

   public static final int BUCKET_ID_NO_PRUNING = -1;

-  public static int getBucketId(List<ResolvedExpression> hashKeyFilters, Configuration conf) {
+  public static int getBucketFieldHashing(List<ResolvedExpression> hashKeyFilters, Configuration conf) {
```

Review Comment:

Sorry Danny, I didn't get this. Is it possible to get the full partition path during the original `dataBucket` computation?

```java
@Override
public Result applyFilters(List<ResolvedExpression> filters) {
  List<ResolvedExpression> simpleFilters = filterSimpleCallExpression(filters);
  Tuple2<List<ResolvedExpression>, List<ResolvedExpression>> splitFilters =
      splitExprByPartitionCall(simpleFilters, this.partitionKeys, this.tableRowType);
  this.predicates = ExpressionPredicates.fromExpression(splitFilters.f0);
  this.columnStatsProbe = ColumnStatsProbe.newInstance(splitFilters.f0);
  this.partitionPruner = createPartitionPruner(splitFilters.f1, columnStatsProbe);
  this.dataBucket = getDataBucket(splitFilters.f0);
  // refuse all the filters now
  return SupportsFilterPushDown.Result.of(new ArrayList<>(splitFilters.f1), new ArrayList<>(filters));
}
```

What this PR does is compute the hashing value and pass it to `getFilesInPartitions`, then compute `numBuckets` there, and finally compute the final bucket id as `hashing value % numBuckets`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
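For readers outside the thread, the two-step scheme the comment describes can be sketched as follows. This is a minimal illustration only: the class and method names (`BucketPruningSketch`, `bucketFieldHashing`, `resolveBucketId`) are hypothetical and are not Hudi APIs; the point is that at filter push-down time only the hash of the bucket key can be fixed, while the modulo against `numBuckets` must wait until the per-partition bucket count is known.

```java
// Hypothetical sketch of deferred bucket-id resolution; not Hudi code.
public class BucketPruningSketch {

  // Step 1, at filter push-down time: derive a non-negative hashing
  // value from the bucket-key filter. The final bucket id cannot be
  // computed yet because numBuckets may vary per partition.
  static int bucketFieldHashing(Object hashKeyValue) {
    // Mask the sign bit so the value is always non-negative.
    return hashKeyValue.hashCode() & Integer.MAX_VALUE;
  }

  // Step 2, once the partition's bucket count is known (analogous to
  // inside getFilesInPartitions): bucket id = hashing % numBuckets.
  static int resolveBucketId(int hashing, int numBuckets) {
    return hashing % numBuckets;
  }

  public static void main(String[] args) {
    int hashing = bucketFieldHashing("user_42");
    // The resolved id always falls in [0, numBuckets).
    System.out.println(resolveBucketId(hashing, 8));
  }
}
```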