adriangb commented on code in PR #22450:
URL: https://github.com/apache/datafusion/pull/22450#discussion_r3409746016
##########
datafusion/datasource-parquet/src/opener/mod.rs:
##########
@@ -1419,16 +1412,40 @@ impl RowGroupsPrunedParquetOpen {
let files_ranges_pruned_statistics =
prepared.file_metrics.files_ranges_pruned_statistics.clone();
+
+ // Build a dynamic row-group pruner only when both halves are useful:
+ // 1) the scan has a predicate (so there is something to evaluate),
+ // 2) there is at least one pending run that could be skipped.
+ // The pruner consults the predicate's `snapshot_generation` so the
+ // cost is one rebuild per dynamic-filter update, not per RG check.
Review Comment:
Is this comment about `snapshot_generation` still current? I thought
https://github.com/apache/datafusion/pull/22460 superseded that.
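
For context, here is a minimal sketch of the pattern the code comment describes: cache a pruning bound keyed by a generation counter, so the bound is rebuilt once per filter update rather than once per row-group check. Apart from the name `snapshot_generation`, which is borrowed from the comment, everything here (`DynamicFilter`, `RowGroupPruner`, `can_skip`, the integer lower bound) is hypothetical and is not DataFusion's actual API.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Hypothetical stand-in for a dynamic filter of the form `col >= lower_bound`
/// whose bound can tighten while the scan is running.
pub struct DynamicFilter {
    generation: AtomicU64,
    lower_bound: AtomicI64,
}

impl DynamicFilter {
    pub fn new(lower_bound: i64) -> Self {
        Self {
            generation: AtomicU64::new(0),
            lower_bound: AtomicI64::new(lower_bound),
        }
    }

    /// Publish a new bound, then bump the generation so readers notice it.
    pub fn update(&self, lower_bound: i64) {
        self.lower_bound.store(lower_bound, Ordering::SeqCst);
        self.generation.fetch_add(1, Ordering::SeqCst);
    }

    /// Cheap check: has the filter changed since the last snapshot?
    pub fn snapshot_generation(&self) -> u64 {
        self.generation.load(Ordering::SeqCst)
    }

    pub fn current_lower_bound(&self) -> i64 {
        self.lower_bound.load(Ordering::SeqCst)
    }
}

/// Hypothetical pruner that caches the bound it last read from the filter.
pub struct RowGroupPruner {
    cached_generation: Option<u64>,
    cached_lower_bound: i64,
    /// Number of times the cached state was rebuilt (for illustration only).
    pub rebuilds: usize,
}

impl RowGroupPruner {
    pub fn new() -> Self {
        Self {
            cached_generation: None,
            cached_lower_bound: i64::MIN,
            rebuilds: 0,
        }
    }

    /// Returns true when a row group with the given max statistic cannot
    /// contain a matching row. The cache is refreshed only when the filter's
    /// generation differs from the one seen at the last rebuild.
    pub fn can_skip(&mut self, filter: &DynamicFilter, row_group_max: i64) -> bool {
        let generation = filter.snapshot_generation();
        if self.cached_generation != Some(generation) {
            self.cached_lower_bound = filter.current_lower_bound();
            self.cached_generation = Some(generation);
            self.rebuilds += 1;
        }
        row_group_max < self.cached_lower_bound
    }
}
```

In this sketch, any number of `can_skip` calls between two `update` calls triggers a single rebuild. If #22460 replaced this mechanism, the code comment should describe whatever took its place.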