adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2085377284


##########
datafusion/physical-optimizer/src/pruning.rs:
##########
@@ -995,6 +996,184 @@ fn build_statistics_record_batch<S: PruningStatistics>(
     })
 }
 
+/// Prune a set of containers represented by their statistics.

Review Comment:
   > Pruning on statistics during plan time would potentially be redundant with 
also trying to prune again during opening, but it would reduce the files 
earlier in the plan
   
   Yeah, I don't think it's redundant: you either prune or you don't. If we 
prune earlier, the files don't make it this far. If we don't, we may now be able 
to prune them. What would be redundant is re-pruning when the filters haven't 
changed (i.e. no dynamic filters), but that sounds both hard to track and like a 
possible future optimization 😄 
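
   To make that concrete, here's a toy sketch of the two passes. It is not the 
API in this PR: `FileStats`, `prune`, and `surviving_names` are made-up names, 
and I'm assuming the filter is a simple `col > threshold` where a dynamic filter 
can only raise the threshold between plan time and open time.

```rust
/// Hypothetical per-file statistics for a single column; a stand-in for
/// whatever a real statistics source would expose.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    name: &'static str,
    /// Largest value of the column in this file.
    max: i64,
}

/// Keep only the files that might contain a row with `col > threshold`.
/// A file is safe to skip when even its maximum fails the predicate.
fn prune(files: &[FileStats], threshold: i64) -> Vec<FileStats> {
    files
        .iter()
        .filter(|f| f.max > threshold)
        .cloned()
        .collect()
}

/// Convenience helper for inspecting which files survived a pass.
fn surviving_names(files: &[FileStats]) -> Vec<&'static str> {
    files.iter().map(|f| f.name).collect()
}
```

   Calling `prune` at plan time and again at open time with the same threshold 
returns the same list the second time, which is the redundant case. If the 
threshold went up in between, the second call can drop files the first one had 
to keep, and files dropped by the first call never reach the second.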



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

