Anton Kovalevsky created CALCITE-6804:
-----------------------------------------

             Summary: Anti-join with WHERE NOT EXISTS syntax has corrupted 
condition
                 Key: CALCITE-6804
                 URL: https://issues.apache.org/jira/browse/CALCITE-6804
             Project: Calcite
          Issue Type: Bug
            Reporter: Anton Kovalevsky


Queries like:
{code:sql}
SELECT * FROM
(
  SELECT field1 FROM table1 JOIN table2 ON table1.field1 = table2.field2
) selected
WHERE NOT EXISTS (select 1 from table3 where table3.field3 = selected.field1)
{code}
are being converted into 

{code:sql}
SELECT * FROM
(
  SELECT field1 FROM table1 JOIN table2 ON table1.field1 = table2.field2
) selected
WHERE NOT EXISTS (select 1 from table3 where table3.field3 = 
table2.<random_field>)
{code}



Example I added to RelToSqlConverterTest
{code:java}
  @Test void testAntiJoinWithWhereNotExists() {
    final String sql = "SELECT * FROM (select * from (select 
e1.\"product_id\"\n"
        + "FROM \"foodmart\".\"product\" e1 LEFT JOIN \"foodmart\".\"product\" 
e3 on e1.\"product_id\" = e3.\"product_id\") s where true) selected where not 
exists\n"
        + "(select 1 from \"foodmart\".\"product\" e2 where e2.\"product_id\" = 
selected.\"product_id\")";


    final String expected = "SELECT *\n" +
        "FROM (SELECT \"product\".\"product_id\"\n" +
        "FROM \"foodmart\".\"product\"\n" +
        "LEFT JOIN \"foodmart\".\"product\" AS \"product0\" ON 
\"product\".\"product_id\" = \"product0\".\"product_id\"" +
        ") AS \"t\"\n" +
        "WHERE EXISTS (SELECT *\nFROM \"foodmart\".\"product\"\nWHERE 
\"product_id\" = \"t\".\"product_class_id\")"
        ;
    sql(sql).ok(expected);
  }
{code}
{code:java}
Expected: is "SELECT ...) AS \"t\"\nWHERE EXISTS (... WHERE \"product_id\" = 
\"t\".\"product_id\")"
    but: was "SELECT ...) AS \"t\"\nWHERE EXISTS (... WHERE \"product_id\" = 
\"product1\".\"product_class_id\")"
{code}
*product1* is generated alias for a query from one of sub-queries, and 
*product_class_id* is a field from that misused table scan.

My high level understanding is that query with WHERE NOT EXISTS syntax is 
considered as LogicalFilter and appropriate pieces of code (like 
*AliasReplacementShuttle* and {*}visitAntiOrSemiJoin{*}) are not invoked.

And visit of Filter node builds alias context inappropriately.

Directions I am trying:
 - Duplicate antiJoin visit under the scope of filter visit.
 - Explicit rule to convert Filter to Join manually. Feels artificial because 
it planner should be triggered, it requires a convention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to