rkrishn7 commented on PR #17452: URL: https://github.com/apache/datafusion/pull/17452#issuecomment-3271839180
> > @adriangb I think there is opportunity to simplify the bounds collection for each partition. That is, we can probably just track the min/max across all partitions and build a single `AND` binary expr once we have the final min/max (i.e. all partition bounds have been reported).
> >
> > Aside from one less mutex, I think it'll help reduce output in `EXPLAIN` as well. Happy to tackle in a follow-up PR
>
> I think that will regress performance: imagine partition 1 has bounds (0, 1) and partition 2 has bounds (999998, 999999). With bounds per partition the value 1234 is filtered out. The merged bounds of (0, 999999) would include that value.

Ah yes 🤦🏾, definitely. Good catch!
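The difference between the two approaches in the quoted exchange can be sketched in a few lines of Rust. This is only an illustration, not DataFusion's actual code: the `Bounds` type and the functions `passes_per_partition`, `merge`, and `passes_merged` are hypothetical names, and the sketch assumes a per-partition filter keeps a value when it falls inside at least one partition's inclusive range.

```rust
/// Inclusive (min, max) bounds reported by one partition.
/// Hypothetical type for illustration; not a DataFusion API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bounds {
    pub min: i64,
    pub max: i64,
}

impl Bounds {
    /// True when `v` lies inside this inclusive range.
    pub fn contains(&self, v: i64) -> bool {
        self.min <= v && v <= self.max
    }
}

/// Per-partition filtering (assumed semantics): keep a value only if it
/// falls inside the range of at least one partition.
pub fn passes_per_partition(bounds: &[Bounds], v: i64) -> bool {
    bounds.iter().any(|b| b.contains(v))
}

/// Collapse all partition ranges into one global (min, max) range.
/// Returns `None` when no partition has reported bounds.
pub fn merge(bounds: &[Bounds]) -> Option<Bounds> {
    let min = bounds.iter().map(|b| b.min).min()?;
    let max = bounds.iter().map(|b| b.max).max()?;
    Some(Bounds { min, max })
}

/// Merged filtering: keep a value if it falls inside the single global range.
pub fn passes_merged(bounds: &[Bounds], v: i64) -> bool {
    merge(bounds).map_or(false, |m| m.contains(v))
}
```

With the two partitions from the example, (0, 1) and (999998, 999999), `passes_per_partition` rejects 1234 while `passes_merged` accepts it, because the merged range (0, 999999) also covers the gap between the partitions. Under these assumptions the single merged range is simpler but filters less.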