friendlymatthew opened a new pull request, #20854:
URL: https://github.com/apache/datafusion/pull/20854

   ## Which issue does this PR close?
   
   - Related https://github.com/apache/datafusion/pull/20822
   - Closes https://github.com/apache/datafusion/issues/20603
   
   ## Rationale for this change
   
   This PR refines how the `FilterCandidateBuilder` projects struct columns 
during Parquet row filter pushdown. 
   
   Previously, a filter like `s['value'] > 10` would cause the reader to decode 
all leaf columns of a struct `s`, because `PushdownChecker` only tracked the 
root column index and expanded it to every leaf. This wastes I/O and decode 
time on fields the filter never touches.
   
   Now, the builder resolves only the matching Parquet leaf columns. It does 
this by building a pruned filter schema that reflects exactly what the Parquet 
reader produces when projecting a subset of struct leaves, ensuring the 
expression evaluates against the correct types.
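
   To illustrate the difference between the two behaviors, here is a minimal, standalone sketch of leaf resolution. It does not use DataFusion or the `parquet` crate: `Node`, `walk`, `leaves_for_path`, and `example_schema` are hypothetical names invented for this example, and the depth-first leaf numbering is a simplified model of how Parquet assigns leaf column indices. It does not cover building the pruned filter schema.

```rust
/// Hypothetical, simplified stand-in for a Parquet schema node.
enum Node {
    /// A primitive (leaf) column.
    Leaf(String),
    /// A struct column with named children.
    Struct(String, Vec<Node>),
}

impl Node {
    fn name(&self) -> &str {
        match self {
            Node::Leaf(name) | Node::Struct(name, _) => name,
        }
    }
}

/// Walks `node` depth-first, advancing the global leaf counter `next`.
/// `remaining` is `None` when the node is off the requested path,
/// `Some(&[])` when the path is fully matched (every leaf below is selected),
/// and `Some(rest)` when `rest` still has to be matched by descendants.
fn walk(node: &Node, remaining: Option<&[&str]>, next: &mut usize, out: &mut Vec<usize>) {
    match node {
        Node::Leaf(_) => {
            if matches!(remaining, Some(r) if r.is_empty()) {
                out.push(*next);
            }
            // Every leaf consumes an index, selected or not.
            *next += 1;
        }
        Node::Struct(_, children) => {
            for child in children {
                let child_remaining = match remaining {
                    // Already inside the selected subtree: keep selecting.
                    Some(r) if r.is_empty() => Some(r),
                    // The next path segment names this child: descend along the path.
                    Some(r) if r[0] == child.name() => Some(&r[1..]),
                    // Otherwise the child is off the path.
                    _ => None,
                };
                walk(child, child_remaining, next, out);
            }
        }
    }
}

/// Returns the leaf indices selected by a field path such as `["s", "value"]`.
fn leaves_for_path(schema: &[Node], path: &[&str]) -> Vec<usize> {
    let mut next = 0;
    let mut out = Vec::new();
    for root in schema {
        let remaining = if !path.is_empty() && path[0] == root.name() {
            Some(&path[1..])
        } else {
            None
        };
        walk(root, remaining, &mut next, &mut out);
    }
    out
}

/// Example schema: `id`, `s { value, label, nested { a } }`, `t`.
fn example_schema() -> Vec<Node> {
    vec![
        Node::Leaf("id".to_string()),
        Node::Struct(
            "s".to_string(),
            vec![
                Node::Leaf("value".to_string()),
                Node::Leaf("label".to_string()),
                Node::Struct("nested".to_string(), vec![Node::Leaf("a".to_string())]),
            ],
        ),
        Node::Leaf("t".to_string()),
    ]
}
```

   In this model, `leaves_for_path(&example_schema(), &["s"])` corresponds to the previous behavior (the root column expands to every leaf of `s`), while `leaves_for_path(&example_schema(), &["s", "value"])` selects only the leaf that the filter `s['value'] > 10` reads.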


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

