adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2094127261

##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -111,19 +120,61 @@ impl FileOpener for ParquetOpener {
             .create(projected_schema, Arc::clone(&self.table_schema));
         let predicate = self.predicate.clone();
         let table_schema = Arc::clone(&self.table_schema);
+        let partition_fields = self.partition_fields.clone();
         let reorder_predicates = self.reorder_filters;
         let pushdown_filters = self.pushdown_filters;
         let coerce_int96 = self.coerce_int96;
         let enable_bloom_filter = self.enable_bloom_filter;
         let enable_row_group_stats_pruning = self.enable_row_group_stats_pruning;
         let limit = self.limit;
 
-        let predicate_creation_errors = MetricBuilder::new(&self.metrics)
-            .global_counter("num_predicate_creation_errors");
-
         let enable_page_index = self.enable_page_index;
 
         Ok(Box::pin(async move {
+            // Prune this file using the file level statistics.

Review Comment:
   > which can be set to true if there are dynamic predicates present

   The issue is: how do we know the filters are dynamic? We've hidden dynamic filters behind `PhysicalExpr` so that the system can treat them as normal filters. We could do it for _any_ filter pushdown, but that doesn't seem like much of an improvement. I also think this pruning should be quite cheap, since the record batches being filtered are just a couple of rows.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
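
A minimal sketch of the point made in the review comment above, under stated assumptions: once a dynamic filter sits behind a trait object, the file opener cannot tell it apart from a static one, and a file-level check against min/max statistics is only a few comparisons. Everything here (`Expr`, `GreaterThan`, `DynamicGreaterThan`, `FileStats`, `should_open`) is hypothetical and stands in for the idea only; it is not DataFusion's `PhysicalExpr` trait or the code in this PR.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a physical expression; NOT DataFusion's `PhysicalExpr`.
trait Expr {
    /// Returns false only when no value in `[min, max]` can satisfy the predicate.
    fn may_match(&self, min: i64, max: i64) -> bool;
}

/// Static predicate: `col > threshold`, fixed at plan time.
struct GreaterThan {
    threshold: i64,
}

impl Expr for GreaterThan {
    fn may_match(&self, _min: i64, max: i64) -> bool {
        max > self.threshold
    }
}

/// Dynamic predicate: `col > threshold`, where another operator may
/// tighten the threshold while the scan is running.
struct DynamicGreaterThan {
    threshold: Arc<RwLock<i64>>,
}

impl Expr for DynamicGreaterThan {
    fn may_match(&self, _min: i64, max: i64) -> bool {
        max > *self.threshold.read().unwrap()
    }
}

/// Hypothetical file-level statistics for a single column.
struct FileStats {
    min: i64,
    max: i64,
}

/// The opener only sees `dyn Expr`, so it cannot ask "is this filter dynamic?".
/// It can still evaluate the predicate against the file statistics cheaply.
fn should_open(predicate: &dyn Expr, stats: &FileStats) -> bool {
    predicate.may_match(stats.min, stats.max)
}

fn main() {
    let stats = FileStats { min: 0, max: 100 };
    let shared = Arc::new(RwLock::new(10_i64));

    let static_pred: Arc<dyn Expr> = Arc::new(GreaterThan { threshold: 10 });
    let dynamic_pred: Arc<dyn Expr> = Arc::new(DynamicGreaterThan {
        threshold: Arc::clone(&shared),
    });

    // At plan time both predicates keep the file.
    assert!(should_open(static_pred.as_ref(), &stats));
    assert!(should_open(dynamic_pred.as_ref(), &stats));

    // The dynamic threshold is tightened before the file is opened.
    *shared.write().unwrap() = 500;

    // Re-checking at open time prunes the file only for the dynamic predicate.
    assert!(should_open(static_pred.as_ref(), &stats));
    assert!(!should_open(dynamic_pred.as_ref(), &stats));
}
```

In this sketch, `should_open` has no way to special-case the dynamic predicate, which is why a flag keyed on "dynamic predicates present" is hard to set from inside the opener. Running the check for every predicate costs one comparison per file here.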