rkrishn7 commented on PR #17452:
URL: https://github.com/apache/datafusion/pull/17452#issuecomment-3271839180

   > > @adriangb I think there is opportunity to simplify the bounds collection for each partition. That is, we can probably just track the min/max across all partitions and build a single `AND` binary expr once we have the final min/max (i.e. all partition bounds have been reported).
   > > Aside from one less mutex, I think it'll help reduce output in `EXPLAIN` as well. Happy to tackle in a follow-up PR
   > 
   > I think that will regress performance: imagine partition 1 has bounds (0, 1) and partition 2 has bounds (999998, 999999). With bounds per partition the value 1234 is filtered out. The merged bounds of (0, 999999) would include that value.
   
   Ah yes 🤦🏾 , definitely. Good catch!
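
   For anyone skimming the thread later, here is a minimal sketch of the difference, using the numbers from the example above. The `Bounds` type and both helper functions are hypothetical and are not DataFusion APIs; it also assumes the per-partition filter behaves like an OR over each partition's `min <= v AND v <= max` range.

```rust
/// Inclusive min/max reported by one build-side partition.
/// Hypothetical type for illustration only, not a DataFusion struct.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: i64,
    max: i64,
}

impl Bounds {
    /// True if `v` lies inside this inclusive range.
    fn contains(&self, v: i64) -> bool {
        self.min <= v && v <= self.max
    }
}

/// Per-partition filtering (assumed semantics): a probe value is kept
/// only if at least one partition's own range contains it.
fn passes_per_partition(bounds: &[Bounds], v: i64) -> bool {
    bounds.iter().any(|b| b.contains(v))
}

/// Merged filtering: collapse all partitions into one global range.
/// Returns `None` when no partition has reported bounds.
fn merge(bounds: &[Bounds]) -> Option<Bounds> {
    let min = bounds.iter().map(|b| b.min).min()?;
    let max = bounds.iter().map(|b| b.max).max()?;
    Some(Bounds { min, max })
}
```

   With partitions (0, 1) and (999998, 999999), `passes_per_partition` rejects 1234, while the range returned by `merge` is (0, 999999) and accepts it, so the merged form prunes less.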


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

