rkrishn7 commented on issue #13510:
URL: https://github.com/apache/datafusion/issues/13510#issuecomment-2626258439

   Hello! I've dug into this issue a bit. The problem seems to arise because 
the table name for each table scan in the plan defaults to `UNNAMED_TABLE` 
(`"?table?"`). 
   
   Because of this, column references in the "on" portion of the join node are 
ambiguous. In fact, if we disambiguate the references, either by changing 
column names or by qualifying at least one table in the example with an alias, 
it works, because type coercion is then able to determine that a cast is 
necessary. Without that, the data type lookup for each column during type 
coercion yields the type of the first matching field in the schema (the 
left-hand side). Both sides of the join condition then appear to have the same 
type, so no cast appears to be necessary.
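
   To make the lookup behaviour above concrete, here is a minimal sketch. It 
uses a simplified, hypothetical schema model (`Field`, `first_match`, 
`needs_cast` and the two schema builders are illustrative names, not 
DataFusion's actual types or API) and only assumes that lookups return the 
first field matching a qualified name:

```rust
// Hypothetical, simplified model of a schema. These are NOT DataFusion types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone)]
struct Field {
    qualifier: &'static str,
    name: &'static str,
    data_type: DataType,
}

/// Returns the type of the FIRST field matching (qualifier, name).
fn first_match(schema: &[Field], qualifier: &str, name: &str) -> Option<DataType> {
    schema
        .iter()
        .find(|f| f.qualifier == qualifier && f.name == name)
        .map(|f| f.data_type)
}

/// A cast looks necessary only when the two references resolve to different types.
fn needs_cast(schema: &[Field], left: (&str, &str), right: (&str, &str)) -> bool {
    first_match(schema, left.0, left.1) != first_match(schema, right.0, right.1)
}

/// Both scans use the default table name, so the two `a` columns collide.
fn unnamed_schema() -> Vec<Field> {
    vec![
        Field { qualifier: "?table?", name: "a", data_type: DataType::Int32 },
        Field { qualifier: "?table?", name: "a", data_type: DataType::Int64 },
    ]
}

/// The right-hand table has an alias, so each reference is unique.
fn aliased_schema() -> Vec<Field> {
    vec![
        Field { qualifier: "?table?", name: "a", data_type: DataType::Int32 },
        Field { qualifier: "t2", name: "a", data_type: DataType::Int64 },
    ]
}
```

   With `unnamed_schema()`, both sides of the condition are looked up as 
`("?table?", "a")` and resolve to the left-hand field, so `needs_cast` returns 
`false`. With `aliased_schema()`, the right side is looked up as `("t2", "a")` 
and `needs_cast` returns `true`.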
   
   Other join types (e.g. inner) fail earlier in planning with a 
`DuplicateQualifiedField` error because they compute the _joined_ schema, 
which performs duplicate name checks. I think a good path forward here may be 
to perform this validation for all join types, so that we fail earlier in 
planning.
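
   As a rough sketch of the kind of validation I mean (the function name and 
the tuple representation of a qualified field are hypothetical, chosen only 
for illustration, and this is not the existing DataFusion check):

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a duplicate-name check over the fields of both
/// join inputs. Each field is modelled as a (qualifier, name) pair.
fn check_no_duplicates<'a>(
    left: &[(&'a str, &'a str)],
    right: &[(&'a str, &'a str)],
) -> Result<(), String> {
    let mut seen = HashSet::new();
    for &(qualifier, name) in left.iter().chain(right.iter()) {
        // `insert` returns false when the pair was already present.
        if !seen.insert((qualifier, name)) {
            return Err(format!("duplicate qualified field: {}.{}", qualifier, name));
        }
    }
    Ok(())
}
```

   Two inputs that both expose `("?table?", "a")` would be rejected here 
regardless of join type, while `("t1", "a")` joined with `("t2", "a")` would 
pass.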
   
   Would love some feedback/thoughts! Thanks!
   
   (FYI @alamb I tested #13370 rebased off the latest main and it does not fix 
the issue. But this is expected based on the description above! 😅 )
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


