adriangb opened a new issue, #13836: URL: https://github.com/apache/datafusion/issues/13836
### Is your feature request related to a problem or challenge?

Are there any scenarios where it makes sense for each column in a container to have a different row count? I think they should always be the same. Even if the columns are stored separately in Parquet, we should be able to pick any non-missing row count and have it be correct.

If this is true, we can simplify the pruning predicate a little. That would make it (possibly insignificantly) faster to evaluate for everyone using DataFusion and, selfishly, would allow me to remove a couple of lines of hacky code in our codebase.

https://github.com/apache/datafusion/blob/46101f3d195d1f8b483e13f2d19485e04070e0b0/datafusion/physical-optimizer/src/pruning.rs#L843

### Describe the solution you'd like

`PruningPredicate` can optionally be configured to reference only a single column called `row_count`.

### Describe alternatives you've considered

Do nothing.

### Additional context

_No response_
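
A rough sketch of the idea, not DataFusion's actual implementation. It assumes the null check in the pruning predicate for `x = v` compares the column's null count against its row count, roughly `x_null_count != x_row_count AND x_min <= v AND v <= x_max`. `ColumnStats`, `may_match_per_column`, `may_match_shared`, and `container_row_count` are hypothetical names made up for illustration. The sketch contrasts the per-column row count with a single shared `row_count` taken from any column that reports one.

```rust
/// Hypothetical per-container statistics for one column (not DataFusion's API).
#[derive(Debug, Clone, Copy)]
struct ColumnStats {
    min: i64,
    max: i64,
    null_count: u64,
    /// Per-column row count; `None` models a missing statistic.
    row_count: Option<u64>,
}

/// Per-column form: `x_null_count != x_row_count AND x_min <= v AND v <= x_max`.
/// Returns true when the container may hold matching rows and must be kept.
fn may_match_per_column(stats: &ColumnStats, v: i64) -> bool {
    // With an unknown row count we cannot prove the column is all-null.
    let all_null = stats.row_count.map_or(false, |rows| rows == stats.null_count);
    !all_null && stats.min <= v && v <= stats.max
}

/// Proposed form: every column is checked against one shared `row_count`.
fn may_match_shared(stats: &ColumnStats, row_count: Option<u64>, v: i64) -> bool {
    let all_null = row_count.map_or(false, |rows| rows == stats.null_count);
    !all_null && stats.min <= v && v <= stats.max
}

/// Pick any non-missing per-column row count as the container's row count.
/// This is only valid under the issue's assumption that all columns in a
/// container have the same number of rows.
fn container_row_count(columns: &[ColumnStats]) -> Option<u64> {
    columns.iter().find_map(|c| c.row_count)
}
```

If the assumption holds, both forms agree whenever a column reports its own row count, and the shared form can additionally prune an all-null column whose own row count is missing.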