alamb opened a new issue, #14335:
URL: https://github.com/apache/datafusion/issues/14335

   ### Describe the bug
   
   `LIMIT`s are incorrectly pushed through `FULL OUTER` Joins
   
   ### To Reproduce
   
   ```sql
   COPY (values (1), (2), (3), (4), (5))  TO '/tmp/t1.csv' STORED AS CSV;
   -- store t2 in different order so the top N rows are not the same as the top 
N rows of t1
   COPY (values (5), (4), (3), (2), (1))  TO '/tmp/t2.csv' STORED AS CSV;
   
   statement ok
   create external table t1(a int) stored as CSV location '/tmp/t1.csv';
   
   statement ok
   create external table t2(b int) stored as CSV location '/tmp/t2.csv';
   
   -- FULL join produces 5 rows (all have matches)
   select * from t1 FULL JOIN t2 ON t1.a = t2.b;
   /*
   +---+---+
   | a | b |
   +---+---+
   | 1 | 1 |
   | 2 | 2 |
   | 3 | 3 |
   | 4 | 4 |
   | 5 | 5 |
   +---+---+
   */
   
   -- the output of this query should be two rows from the previous query, there
   -- should be no nulls
   select * from t1 FULL JOIN t2 ON t1.a = t2.b LIMIT 2;
   
   /*
   +------+------+
   | a    | b    |
   +------+------+
   | 1    | NULL |
   | NULL | 4    |
   +------+------+
   */
   
   ```
   
   ### Expected behavior
   
   Any two rows from 
   ```sql
   +---+---+
   | a | b |
   +---+---+
   | 1 | 1 |
   | 2 | 2 |
   | 3 | 3 |
   | 4 | 4 |
   | 5 | 5 |
   +---+---+
   ```
   
   There should be no nulls introduced
   
   ### Additional context
   
   Found while working with @zhuqi-lucas on:
   - https://github.com/apache/datafusion/pull/14245


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to