jonahgao opened a new issue, #14118:
URL: https://github.com/apache/datafusion/issues/14118

   ### Describe the bug
   
   ## Background
   Select with wildcard over a USING/NATURAL JOIN should deduplicate join 
columns.
   For example: 
   ```sql
   with t as(select 1 as a, 2 as b) select * from t t1 join t t2 using(a)
   ``` 
   This query above should output the column 'a' only once. 
   
   `LogicalPlan::using_columns()` is used to find these join columns and to 
help exclude duplicated columns when expanding wildcards.
   
   ## Problem
   
[using_columns()](https://github.com/apache/datafusion/blob/f9cc3325cdb5891b7566a6f3503c1f7ac6ad51e0/datafusion/expr/src/logical_plan/plan.rs#L485)
 works by traversing the plan tree. This manner might be unsafe as it could 
incorrectly find columns that are not relevant to the current SQL context. This 
may lead to some output columns being incorrectly excluded.
   
   For example, the result of the query below is different from other databases.
   ```sql
   create table t(a int);
   insert into t values(1),(2),(3);
   select * from (select t.a+2 as a from t join t t2 using(a)) as t2;
   `````
   
   
   ### To Reproduce
   
   Run query in CLI.
   ```
   > create table t(a int);
   insert into t values(1),(2),(3);
   select * from (select t.a+2 as a from t join t t2 using(a)) as t2;
   0 row(s) fetched.
   Elapsed 0.008 seconds.
   
   +-------+
   | count |
   +-------+
   | 3     |
   +-------+
   1 row(s) fetched.
   Elapsed 0.015 seconds.
   
   ++
   ++
   ++
   3 row(s) fetched.
   Elapsed 0.013 seconds.
   ```
   It outputs no columns.
   
   ### Expected behavior
   
   In PostgreSQL it does output one column.
   ```
   psql (16.6 (Ubuntu 16.6-0ubuntu0.24.04.1))
   Type "help" for help.
   
   psql=> create table t(a int);
   insert into t values(1),(2),(3);
   select * from (select t.a+2 as a from t join t t2 using(a)) as t2;
   CREATE TABLE
   INSERT 0 3
    a
   ---
    3
    4
    5
   (3 rows)
   ```
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to