findepi commented on issue #1468: URL: https://github.com/apache/datafusion/issues/1468#issuecomment-2488702713
Schema / DFSchema serves a dual purpose, and this is related to the existence and handling of Expr::Alias.

One purpose is understanding source tables and resolving identifiers in SQL queries. This job is necessary, and it has to be done once the parser has produced the SQL syntax tree (AST). The other purpose is column/symbol/variable resolution between the components of a logical plan. Some engines (like Trino) use a separate abstraction for that second purpose.

Reusing the schema for plan-internal resolution makes it very hard to solve https://github.com/apache/datafusion/issues/13476 and https://github.com/apache/datafusion/issues/6543 (see the WIP at https://github.com/apache/datafusion/pull/13489#issuecomment-2488695180).

I know addressing this is hard, but looking five years ahead, would we prefer these issues to still be with us? Or would we prefer to address them sooner rather than later?

I think we should refactor LogicalPlans not to use aliases and column references at all. This can fall under https://github.com/apache/datafusion/issues/12723. We should find a way to do this incrementally, and I believe there is a way to do so.
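To make the proposed separation concrete, here is a minimal sketch of the two roles kept apart. It is an illustration only, not DataFusion's or Trino's actual API: `Symbol`, `SymbolAllocator`, `Scope`, `Expr`, `Plan`, and `plan_example` are all hypothetical names invented for this example. The `Scope` handles name resolution while the SQL AST is analyzed; the plan itself refers only to opaque symbols, and user-visible names appear only at the root.

```rust
/// Opaque plan-level identifier (hypothetical; stands in for the separate
/// abstraction that the comment says engines like Trino use).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Hands out fresh, never-reused symbols.
#[derive(Default)]
pub struct SymbolAllocator {
    next: u32,
}

impl SymbolAllocator {
    pub fn fresh(&mut self) -> Symbol {
        let s = Symbol(self.next);
        self.next += 1;
        s
    }
}

/// Name-resolution scope: needed only while analyzing the SQL AST.
#[derive(Default)]
pub struct Scope {
    columns: Vec<(Option<String>, String, Symbol)>,
}

impl Scope {
    pub fn add(&mut self, qualifier: Option<&str>, name: &str, symbol: Symbol) {
        self.columns
            .push((qualifier.map(str::to_owned), name.to_owned(), symbol));
    }

    /// Maps a (possibly qualified) identifier to a symbol.
    /// Fails if the identifier is unknown or ambiguous.
    pub fn resolve(&self, qualifier: Option<&str>, name: &str) -> Result<Symbol, String> {
        let matches: Vec<Symbol> = self
            .columns
            .iter()
            .filter(|(q, n, _)| {
                n.as_str() == name && (qualifier.is_none() || q.as_deref() == qualifier)
            })
            .map(|(_, _, s)| *s)
            .collect();
        match matches.as_slice() {
            [s] => Ok(*s),
            [] => Err(format!("unknown column {}", name)),
            _ => Err(format!("ambiguous column {}", name)),
        }
    }
}

/// Plan expressions refer to symbols, never to names or aliases.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Ref(Symbol),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug)]
pub enum Plan {
    Scan { table: String, outputs: Vec<Symbol> },
    /// Each assignment binds a fresh symbol to an expression (no alias node).
    Project { input: Box<Plan>, assignments: Vec<(Symbol, Expr)> },
    /// User-visible column names exist only at the root of the plan.
    Output { input: Box<Plan>, columns: Vec<(String, Symbol)> },
}

/// Plans `SELECT t.a + 1 AS a FROM t` for an assumed table `t(a, b)`.
pub fn plan_example() -> Result<Plan, String> {
    let mut symbols = SymbolAllocator::default();
    let (a, b) = (symbols.fresh(), symbols.fresh());

    let mut scope = Scope::default();
    scope.add(Some("t"), "a", a);
    scope.add(Some("t"), "b", b);
    let scan = Plan::Scan { table: "t".to_owned(), outputs: vec![a, b] };

    // Names are resolved exactly once, here; the plan never sees "t.a".
    let source_a = scope.resolve(Some("t"), "a")?;
    let sum = symbols.fresh();
    let project = Plan::Project {
        input: Box::new(scan),
        assignments: vec![(
            sum,
            Expr::Add(Box::new(Expr::Ref(source_a)), Box::new(Expr::Literal(1))),
        )],
    };

    // The alias `a` reuses the source column's name without any clash,
    // because the two are distinct symbols inside the plan.
    Ok(Plan::Output { input: Box::new(project), columns: vec![("a".to_owned(), sum)] })
}
```

In this sketch the output column `a` and the source column `t.a` share a name but are different symbols, so a rewrite inside the plan cannot confuse them. Whether DataFusion could adopt such a shape incrementally is exactly the open question raised above.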
