alamb commented on issue #12604: URL: https://github.com/apache/datafusion/issues/12604#issuecomment-2377859743
> let Expr carry the type explicitly (that would be "logical type" I think this is the > correct, unless get_type is called more than once. optimizer rules may check types. DF doesn't have as many optimizer rules yet, but I am basing this on observation and profiling of Trino optimi FWIW I don't think i have seen get_type appear in the planning profiling traces (this may be because the overhead is mostly dominated by copies so the type calculations are overshadowed. There are some times when the type of an intermedidate expression is needed (like the simplifier) but it doesn't seem to have been a bottleneck so far > to make this large change it might be a very good reason for doing it. I think we need to list the exact benefits of having this refactoring I agree with @comphead -- I think this propsoal would be stronger if it was in the context of 1. I am trying to do `X` but the current way types are represented makes it not possible 2. I am trying to do `Y` but profiling shows a large amount of time being spent recomputing unecessary types In other words, if we were designing a system from scratch this proposal makes sense to me, but I think at this point we already have a system that seems to work reasonably well for our existing yers > Eg different coercions between types. To be even more concrete with example, Trino and SQL Server has different rules for coercing decimals of different precision and scale. I think making an API for user-defined coercion rules in DataFusion would be really nice (not just to support the usecase, but I think forcing the internal implementaiton into a thoughtful API would make the code easier to reason about and better strutured) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
