friendlymatthew opened a new pull request, #20854: URL: https://github.com/apache/datafusion/pull/20854
## Which issue does this PR close?

- Related https://github.com/apache/datafusion/pull/20822
- Closes https://github.com/apache/datafusion/issues/20603

## Rationale for this change

This PR refines how the `FilterCandidateBuilder` projects struct columns during Parquet row filter pushdown.

Previously, a filter like `s['value'] > 10` would cause the reader to decode all leaf columns of a struct `s`, because `PushdownChecker` only tracked the root column index and expanded it to every leaf. This wastes I/O and decode time on fields the filter never touches.

Now, the builder resolves only the matching Parquet leaf columns. It does this by building a pruned filter schema that reflects exactly what the Parquet reader produces when projecting a subset of struct leaves, ensuring the expression evaluates against the correct types.
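To make the difference between root-level and leaf-level projection concrete, here is a minimal, std-only sketch. It is not the code in this PR: `Field`, `leaf`, `group`, `leaf_indices`, and `prune` are hypothetical names, and the schema model assumes only plain leaves and structs (no lists, maps, or repetition levels), with leaves numbered in depth-first order.

```rust
/// Hypothetical, simplified stand-in for a Parquet schema node.
#[derive(Debug, Clone, PartialEq)]
enum Field {
    Leaf(String),
    Struct(String, Vec<Field>),
}

impl Field {
    fn name(&self) -> &str {
        match self {
            Field::Leaf(n) | Field::Struct(n, _) => n,
        }
    }

    /// Number of leaf columns stored under this node.
    fn leaf_count(&self) -> usize {
        match self {
            Field::Leaf(_) => 1,
            Field::Struct(_, children) => children.iter().map(Field::leaf_count).sum(),
        }
    }
}

// Small constructors to keep example schemas readable.
fn leaf(name: &str) -> Field {
    Field::Leaf(name.to_string())
}

fn group(name: &str, children: Vec<Field>) -> Field {
    Field::Struct(name.to_string(), children)
}

/// Depth-first leaf indices covered by `path`, e.g. `["s", "value"]`.
/// A one-element path covers every leaf under that root column.
/// Returns an empty vector if the path does not resolve.
fn leaf_indices(schema: &[Field], path: &[&str]) -> Vec<usize> {
    let mut offset = 0;
    let mut fields = schema;
    for (depth, segment) in path.iter().enumerate() {
        let mut found = None;
        for f in fields {
            if f.name() == *segment {
                found = Some(f);
                break;
            }
            // Skip all leaves of siblings that come before the match.
            offset += f.leaf_count();
        }
        let Some(field) = found else { return Vec::new() };
        match field {
            // More path segments remain: descend into the struct.
            Field::Struct(_, children) if depth + 1 < path.len() => fields = children.as_slice(),
            // The path tries to step through a leaf: it cannot resolve.
            Field::Leaf(_) if depth + 1 < path.len() => return Vec::new(),
            // Last segment: take every leaf under the matched node.
            _ => return (offset..offset + field.leaf_count()).collect(),
        }
    }
    Vec::new()
}

/// Schema containing only the leaves in `keep`; structs left empty are dropped.
/// `next` carries the running depth-first leaf index across recursive calls.
fn prune(fields: &[Field], keep: &[usize], next: &mut usize) -> Vec<Field> {
    let mut out = Vec::new();
    for f in fields {
        match f {
            Field::Leaf(name) => {
                if keep.contains(&*next) {
                    out.push(Field::Leaf(name.clone()));
                }
                *next += 1;
            }
            Field::Struct(name, children) => {
                let kept = prune(children, keep, next);
                if !kept.is_empty() {
                    out.push(Field::Struct(name.clone(), kept));
                }
            }
        }
    }
    out
}
```

For a schema `id, s { value, label, meta { a } }, t`, `leaf_indices(&schema, &["s"])` yields `[1, 2, 3]` (the old root-expansion behaviour), while `leaf_indices(&schema, &["s", "value"])` yields `[1]`. Passing that result to `prune` gives a schema with just `s { value }`, which is the shape the filter expression would then be evaluated against in this simplified model.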
