Mark1626 commented on issue #18922: URL: https://github.com/apache/datafusion/issues/18922#issuecomment-3594303681
I was using `PartitionStatistics` in isolation and noticed that it prunes all the files: `PartitionStatistics` is incorrectly marking files to be pruned.

Let's say all the available partitions for `ss_sold_date_sk` are `[(2451000), (2451001), (2451002), (245103)]`.

The current behaviour of `contained` on `PartitionStatistics` is:

1. It returns `[true]` for `select ss_list_price from store_sales where ss_sold_date_sk = 2451000`
2. It returns `[false, false]` for `select ss_list_price from store_sales where ss_sold_date_sk in (2451000, 2451001)`

The second scenario should return `[true, true]`.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
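To make the expected behaviour concrete, here is a minimal sketch in plain Rust. It is not DataFusion's implementation: the free function `contained` below is a hypothetical stand-in, and it assumes each file carries exactly one partition value, so membership in the literal set is always known. It returns one entry per partition, so its output is longer than the abbreviated arrays quoted in the comment.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a `contained`-style check over partition values.
/// Assumption: each container (file) holds exactly one partition value, so the
/// answer is always known: `Some(true)` if that value is in `values`,
/// otherwise `Some(false)`.
fn contained(partition_values: &[i64], values: &HashSet<i64>) -> Vec<Option<bool>> {
    partition_values
        .iter()
        .map(|v| Some(values.contains(v)))
        .collect()
}

fn main() {
    // Partition values copied verbatim from the comment.
    let partitions: [i64; 4] = [2451000, 2451001, 2451002, 245103];

    // WHERE ss_sold_date_sk = 2451000
    let eq: HashSet<i64> = HashSet::from([2451000]);
    println!("{:?}", contained(&partitions, &eq));

    // WHERE ss_sold_date_sk IN (2451000, 2451001)
    let in_list: HashSet<i64> = HashSet::from([2451000, 2451001]);
    println!("{:?}", contained(&partitions, &in_list));
}
```

Under this model, the two partitions named in the `IN` list both come back as `Some(true)`, which corresponds to the `[true, true]` result the comment expects rather than `[false, false]`.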
