[I] [EPIC] expression pushdown and file level expression handling [datafusion]

via GitHub Wed, 25 Jun 2025 12:34:15 -0700


adriangb opened a new issue, #16528:
URL: https://github.com/apache/datafusion/issues/16528


   Several feature requests / issues have come up that I think can all be 
addressed with the groundwork being laid in 
https://github.com/apache/datafusion/pull/16461:
   - https://github.com/apache/datafusion/pull/15261
   - https://github.com/apache/datafusion/pull/15057
   - https://github.com/apache/datafusion/issues/16004
   - https://github.com/apache/datafusion/issues/14993
   
   In particular, https://github.com/apache/datafusion/pull/16461 introduces a 
general framework for adapting an expression to a file's schema, handling any 
necessary casts and missing columns.
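
A minimal sketch of what "adapting an expression to a file's schema" can mean, assuming a toy expression tree. The `Expr`, `DataType`, `Schema`, and `adapt_to_file` names are hypothetical and chosen for illustration only; they are not the API introduced in the PR above.

```rust
use std::collections::HashMap;

// Hypothetical, simplified types for illustration; these are not DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    NullLiteral(DataType),
    Int64Literal(i64),
    Cast(Box<Expr>, DataType),
    Eq(Box<Expr>, Box<Expr>),
}

// Column name -> type. One instance describes the table, another a single file.
type Schema = HashMap<String, DataType>;

/// Rewrite `expr`, written against the table schema, so that it can be
/// evaluated against one file's schema.
fn adapt_to_file(expr: &Expr, table: &Schema, file: &Schema) -> Result<Expr, String> {
    match expr {
        Expr::Column(name) => {
            let table_type = table
                .get(name)
                .ok_or_else(|| format!("unknown column {name}"))?;
            match file.get(name) {
                // Column missing from the file: substitute a typed null.
                None => Ok(Expr::NullLiteral(table_type.clone())),
                // Same type in file and table: nothing to do.
                Some(file_type) if file_type == table_type => Ok(expr.clone()),
                // Type differs: cast the file column to the table type.
                Some(_) => Ok(Expr::Cast(Box::new(expr.clone()), table_type.clone())),
            }
        }
        Expr::Cast(inner, to) => Ok(Expr::Cast(
            Box::new(adapt_to_file(inner, table, file)?),
            to.clone(),
        )),
        Expr::Eq(l, r) => Ok(Expr::Eq(
            Box::new(adapt_to_file(l, table, file)?),
            Box::new(adapt_to_file(r, table, file)?),
        )),
        Expr::NullLiteral(_) | Expr::Int64Literal(_) => Ok(expr.clone()),
    }
}
```

With a table schema of `a: Int64, b: Int64` and a file that only stores `a: Int32`, the predicate `a = 5` becomes `CAST(a AS Int64) = 5`, and a reference to `b` becomes a typed null.
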
   We can expand this by:
   
   - Optimizing the expressions to minimize the cost of casts; WIP in 
https://github.com/pydantic/datafusion/pull/31. Closes 
https://github.com/apache/datafusion/issues/16004.
   - Other optimization passes, such as evaluating literals / nulls. Also 
related to https://github.com/apache/datafusion/issues/16004.
   - A hook to handle missing columns (e.g. instead of filling in nulls, do 
something else based on Field metadata, such as using a user-defined default 
value); closes https://github.com/apache/datafusion/pull/15261
   - Hook to transform an expression before or after it is rewritten for the 
physical file schema; closes https://github.com/apache/datafusion/pull/15057.
   - An optimization to eliminate casts altogether when two types share the 
same Parquet physical type (change the schema the data is read with and remove 
the cast).
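
Two of the extensions listed above, the missing-column hook and the cast-minimizing rewrite, can be sketched on a toy expression tree. Everything here (`Expr`, `fill_missing`, `unwrap_cast`, the `default_for` callback) is a hypothetical name for illustration, not the design in the linked PRs. The sketch also assumes every file column is stored as Int32, so a cast to Int64 around a column can be dropped when the compared literal fits in an `i32`.

```rust
// Hypothetical, simplified types for illustration only; not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Null,
    Int32Literal(i32),
    Int64Literal(i64),
    CastToInt64(Box<Expr>),
    Eq(Box<Expr>, Box<Expr>),
}

/// Hook for missing columns: `default_for` stands in for a lookup of a
/// user-defined default (e.g. from Field metadata). If it returns `None`,
/// fall back to filling in a null.
fn fill_missing(
    expr: Expr,
    file_columns: &[&str],
    default_for: &dyn Fn(&str) -> Option<Expr>,
) -> Expr {
    match expr {
        Expr::Column(name) if !file_columns.iter().any(|c| *c == name.as_str()) => {
            default_for(&name).unwrap_or(Expr::Null)
        }
        Expr::CastToInt64(inner) => {
            Expr::CastToInt64(Box::new(fill_missing(*inner, file_columns, default_for)))
        }
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(fill_missing(*l, file_columns, default_for)),
            Box::new(fill_missing(*r, file_columns, default_for)),
        ),
        // Columns present in the file and literals are left unchanged.
        other => other,
    }
}

/// Cast minimization: rewrite `CAST(x AS Int64) = <i64 literal>` into
/// `x = <i32 literal>` when the literal fits in i32, so the cast is applied
/// once to the literal instead of to every row. Assumes `x` is Int32.
fn unwrap_cast(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(l, r) => match (*l, *r) {
            (Expr::CastToInt64(inner), Expr::Int64Literal(v))
                if v >= i32::MIN as i64 && v <= i32::MAX as i64 =>
            {
                Expr::Eq(inner, Box::new(Expr::Int32Literal(v as i32)))
            }
            // Literal out of range or a different shape: keep as-is.
            (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

Keeping each step as a separate pass over the expression means the hooks and optimizations can be composed in whatever order the file format needs.
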

