alamb opened a new issue, #14987: URL: https://github.com/apache/datafusion/issues/14987
### Is your feature request related to a problem or challenge?

As described by @ion-elgreco in https://github.com/apache/datafusion/issues/14944

Given a dataset with an `Int64` column named `month_id`, when a predicate such as the following is created

```sql
month_id = '202502'
```

In SQL / DataFrame queries, this will be simplified to the following. Note that the constant is cast and the column is not:

```sql
month_id = cast('202502', 'Int64')
```

However, when using [`SessionContext::create_physical_expr`](https://github.com/apache/datafusion/blob/5e49094c159ce110bebd2bb6f4858ff515cd1860/datafusion-examples/examples/expr_api.rs#L540-L543) to create a physical expression directly, as is done in delta-rs and other systems like LanceDB, the expression looks like this (cast on the column):

```sql
cast(month_id, 'Utf8') = '202502'
```

This is bad for two reasons:
1. `PruningPredicate` can't handle this type of expression (and thus it can't be used to prune Parquet row groups)
2. Evaluating this filter is substantially slower as it has to apply a transformation to *all* values of `month_id` before it can evaluate the filter.
Furthermore, it performs a slow string comparison instead of a faster Int64 comparison.

The reason this happens is that the conversion from `cast(month_id, 'Utf8') = '202502'` to `month_id = cast('202502', 'Int64')` happens in the Analyzer, specifically here: https://github.com/apache/datafusion/blob/2fcab2ef0da474ec000d7410427b9d18afb5820b/datafusion/optimizer/src/unwrap_cast_in_comparison.rs#L39-L77

However, this pass is not run as part of `SessionContext::create_physical_expr`.

### Describe the solution you'd like

I would like the expressions created by `SessionContext::create_physical_expr` to have had their casts unwrapped as well.

### Describe alternatives you've considered

The ideal solution in my mind is to remove the entire Analyzer pass and instead do the unwrap in comparisons as part of the expression simplification https://docs.rs/datafusion/latest/datafusion/optimizer/simplify_expressions/expr_simplifier/struct.ExprSimplifier.html

### Additional context

- https://github.com/delta-io/delta-rs/issues/3278
- https://github.com/apache/datafusion/issues/14944
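
For illustration only, here is a minimal sketch of the kind of rewrite requested above: moving the cast off the column and onto the literal. The `Expr` and `DataType` types and the `unwrap_cast_in_comparison` function below are hypothetical toy definitions written for this example; they are not DataFusion's actual types or API. The sketch assumes the column is `Int64` and only rewrites when the string literal is the canonical text form of an `i64`, since a literal such as `'0202502'` can never equal the string form of an integer and must not be turned into `month_id = 202502`.

```rust
// Toy expression tree for illustration; NOT DataFusion's `Expr`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Reference to a column that is assumed to have type Int64.
    Column(String),
    LitInt64(i64),
    LitUtf8(String),
    Cast(Box<Expr>, DataType),
    Eq(Box<Expr>, Box<Expr>),
}

/// Rewrites `cast(col, Utf8) = '<int text>'` into `col = <int literal>`.
/// Any other expression, or a literal that does not round-trip through
/// `i64`, is returned unchanged.
fn unwrap_cast_in_comparison(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(left, right) => match (*left, *right) {
            (Expr::Cast(inner, DataType::Utf8), Expr::LitUtf8(s))
                if matches!(inner.as_ref(), Expr::Column(_)) =>
            {
                match s.parse::<i64>() {
                    // Only rewrite when the text is exactly what casting the
                    // integer back to a string would produce.
                    Ok(v) if v.to_string() == s => {
                        Expr::Eq(inner, Box::new(Expr::LitInt64(v)))
                    }
                    // Otherwise rebuild the original comparison untouched.
                    _ => Expr::Eq(
                        Box::new(Expr::Cast(inner, DataType::Utf8)),
                        Box::new(Expr::LitUtf8(s)),
                    ),
                }
            }
            (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

fn main() {
    let before = Expr::Eq(
        Box::new(Expr::Cast(
            Box::new(Expr::Column("month_id".to_string())),
            DataType::Utf8,
        )),
        Box::new(Expr::LitUtf8("202502".to_string())),
    );
    let after = unwrap_cast_in_comparison(before);
    println!("{:?}", after);
}
```

After the rewrite the comparison is between the bare column and an integer literal, which is the shape that min/max statistics pruning and a plain integer comparison kernel can work with directly.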