jonahgao opened a new issue, #14118: URL: https://github.com/apache/datafusion/issues/14118
### Describe the bug ## Background Select with wildcard over a USING/NATURAL JOIN should deduplicate join columns. For example: ```sql with t as(select 1 as a, 2 as b) select * from t t1 join t t2 using(a) ``` This query above should output the column 'a' only once. `LogicalPlan::using_columns()` is used to find these join columns and to help exclude duplicated columns when expanding wildcards. ## Problem [using_columns()](https://github.com/apache/datafusion/blob/f9cc3325cdb5891b7566a6f3503c1f7ac6ad51e0/datafusion/expr/src/logical_plan/plan.rs#L485) works by traversing the plan tree. This manner might be unsafe as it could incorrectly find columns that are not relevant to the current SQL context. This may lead to some output columns being incorrectly excluded. For example, the result of the query below is different from other databases. ```sql create table t(a int); insert into t values(1),(2),(3); select * from (select t.a+2 as a from t join t t2 using(a)) as t2; ````` ### To Reproduce Run query in CLI. ``` > create table t(a int); insert into t values(1),(2),(3); select * from (select t.a+2 as a from t join t t2 using(a)) as t2; 0 row(s) fetched. Elapsed 0.008 seconds. +-------+ | count | +-------+ | 3 | +-------+ 1 row(s) fetched. Elapsed 0.015 seconds. ++ ++ ++ 3 row(s) fetched. Elapsed 0.013 seconds. ``` It outputs no columns. ### Expected behavior In PostgreSQL it does output one column. ``` psql (16.6 (Ubuntu 16.6-0ubuntu0.24.04.1)) Type "help" for help. psql=> create table t(a int); insert into t values(1),(2),(3); select * from (select t.a+2 as a from t join t t2 using(a)) as t2; CREATE TABLE INSERT 0 3 a --- 3 4 5 (3 rows) ``` ### Additional context _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org