Mark1626 commented on issue #18922:
URL: https://github.com/apache/datafusion/issues/18922#issuecomment-3594303681

   I was using `PartitionStatistics` in isolation and noticed that it prunes 
all the files: `PartitionStatistics` incorrectly marks the files as prunable. 
Say the available partitions for `ss_sold_date_sk` are 
`[(2451000), (2451001), (2451002), (245103)]`. The current behaviour of 
`contained` on `PartitionStatistics` is:
   1. returns `[true]` for `select ss_list_price from store_sales where 
ss_sold_date_sk = 2451000`
   2. returns `[false, false]` for `select ss_list_price from store_sales where 
ss_sold_date_sk in (2451000, 2451001)`
   
   The second scenario should return `[true, true]`.
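
To spell out the expectation, here is a minimal sketch of the semantics I would expect, not the actual DataFusion implementation. `contained_model` is a hypothetical helper, and it assumes each container holds exactly one partition value and that the containers being checked are the partitions named in the predicate:

```rust
use std::collections::HashSet;

/// Hypothetical model of the expected per-container result (an assumption,
/// not DataFusion's code). Each container is a partition with a single
/// value for the column, so it "contains only values from the set" exactly
/// when that value is one of the literals in the predicate.
fn contained_model(partition_values: &[i64], literals: &HashSet<i64>) -> Vec<Option<bool>> {
    partition_values
        .iter()
        // Some(true): the partition value is in the literal set.
        // Some(false): the partition value matches none of the literals.
        .map(|v| Some(literals.contains(v)))
        .collect()
}
```

Under these assumptions, the `=` case and the `in (...)` case follow the same rule: a partition whose value appears in the literal set yields `true`, so an `IN` list with two matching partitions should yield two `true` entries rather than two `false` entries.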


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]