[ 
https://issues.apache.org/jira/browse/CALCITE-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yang updated CALCITE-6884:
------------------------------
    Description: 
The `RelToSqlConverter` class has a bug when converting a `Correlate` operator 
to SQL for MySQL dialect. When the left input of the correlate contains 
multiple tables (like a join), it generates invalid MySQL syntax by placing 
parentheses around the tables and applying an alias to the entire expression.

Steps to Reproduce



1. Using the TPC-H schema
2. Starting with this query:
SELECT SUM(l_extendedprice) / 7.0 AS avg_yearly 
FROM lineitem, part 
WHERE p_partkey = l_partkey 
  AND p_brand = 'Brand#23' 
  AND p_container = 'MED BOX' 
  AND l_quantity < (SELECT 0.2 * AVG(l_quantity) 
                    FROM lineitem 
                    WHERE l_partkey = p_partkey);

3. Apply the "{{{}FILTER_SUB_QUERY_TO_CORRELATE"{}}} rule

4. Convert the resulting RelNode to SQL using the MySQL dialect

 

Expected Behavior

The generated SQL should use syntax that MySQL supports for lateral joins, for 
example:
SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
FROM `lineitem`, `part`,
LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
FROM `lineitem`
WHERE `l_partkey`=`p_partkey`)  AS `t2`
WHERE `p_partkey`=`l_partkey`
AND `p_brand`='Brand#23'
AND `p_container`='MED BOX'
AND `l_quantity`<`t2`.`EXPR$0`;


Actual Behavior
The converter generates this invalid SQL:

SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
FROM (`lineitem`, `part`) AS `$cor0`,
LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
FROM `lineitem`
WHERE `l_partkey`=`$cor0`.`p_partkey`)  AS `t2`
WHERE `$cor0`.`p_partkey`=`$cor0`.`l_partkey`
AND `$cor0`.`p_brand`='Brand#23'
AND `$cor0`.`p_container`='MED BOX'
AND `$cor0`.`l_quantity`<`t2`.`EXPR$0`;

  was:
The `RelToSqlConverter` class has a bug when converting a `Correlate` operator 
to SQL for MySQL dialect. When the left input of the correlate contains 
multiple tables (like a join), it generates invalid MySQL syntax by placing 
parentheses around the tables and applying an alias to the entire expression.

## Steps to Reproduce

1. Using the TPC-H schema
2. Starting with this query:
SELECT SUM(l_extendedprice) / 7.0 AS avg_yearly 
FROM lineitem, part 
WHERE p_partkey = l_partkey 
  AND p_brand = 'Brand#23' 
  AND p_container = 'MED BOX' 
  AND l_quantity < (SELECT 0.2 * AVG(l_quantity) 
                    FROM lineitem 
                    WHERE l_partkey = p_partkey);

3. Apply the "{{{}FILTER_SUB_QUERY_TO_CORRELATE"{}}} rule

4. Convert the resulting RelNode to SQL using the MySQL dialect

 

## Expected Behavior

The generated SQL should use syntax that MySQL supports for lateral joins, for 
example:
SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
FROM `lineitem`, `part`,
LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
FROM `lineitem`
WHERE `l_partkey`=`p_partkey`)  AS `t2`
WHERE `p_partkey`=`l_partkey`
AND `p_brand`='Brand#23'
AND `p_container`='MED BOX'
AND `l_quantity`<`t2`.`EXPR$0`;

## Actual Behavior
The converter generates this invalid SQL:

SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
FROM (`lineitem`, `part`) AS `$cor0`,
LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
FROM `lineitem`
WHERE `l_partkey`=`$cor0`.`p_partkey`)  AS `t2`
WHERE `$cor0`.`p_partkey`=`$cor0`.`l_partkey`
AND `$cor0`.`p_brand`='Brand#23'
AND `$cor0`.`p_container`='MED BOX'
AND `$cor0`.`l_quantity`<`t2`.`EXPR$0`;


> RelToSqlConverter generates invalid MySQL syntax for Correlate operator with 
> multiple tables
> --------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-6884
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6884
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.38.0
>            Reporter: Jie Yang
>            Priority: Major
>
> The `RelToSqlConverter` class has a bug when converting a `Correlate` 
> operator to SQL for MySQL dialect. When the left input of the correlate 
> contains multiple tables (like a join), it generates invalid MySQL syntax by 
> placing parentheses around the tables and applying an alias to the entire 
> expression.
> Steps to Reproduce
> 1. Using the TPC-H schema
> 2. Starting with this query:
> SELECT SUM(l_extendedprice) / 7.0 AS avg_yearly 
> FROM lineitem, part 
> WHERE p_partkey = l_partkey 
>   AND p_brand = 'Brand#23' 
>   AND p_container = 'MED BOX' 
>   AND l_quantity < (SELECT 0.2 * AVG(l_quantity) 
>                     FROM lineitem 
>                     WHERE l_partkey = p_partkey);
> 3. Apply the "{{{}FILTER_SUB_QUERY_TO_CORRELATE"{}}} rule
> 4. Convert the resulting RelNode to SQL using the MySQL dialect
>  
> Expected Behavior
> The generated SQL should use syntax that MySQL supports for lateral joins, 
> for example:
> SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
> FROM `lineitem`, `part`,
> LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
> FROM `lineitem`
> WHERE `l_partkey`=`p_partkey`)  AS `t2`
> WHERE `p_partkey`=`l_partkey`
> AND `p_brand`='Brand#23'
> AND `p_container`='MED BOX'
> AND `l_quantity`<`t2`.`EXPR$0`;
> Actual Behavior
> The converter generates this invalid SQL:
> SELECT SUM(`$cor0`.`l_extendedprice`)  / 7.0 AS `avg_yearly`
> FROM (`lineitem`, `part`) AS `$cor0`,
> LATERAL (SELECT 0.2*AVG(`l_quantity`) AS `EXPR$0`
> FROM `lineitem`
> WHERE `l_partkey`=`$cor0`.`p_partkey`)  AS `t2`
> WHERE `$cor0`.`p_partkey`=`$cor0`.`l_partkey`
> AND `$cor0`.`p_brand`='Brand#23'
> AND `$cor0`.`p_container`='MED BOX'
> AND `$cor0`.`l_quantity`<`t2`.`EXPR$0`;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to