Lino Rosa created CALCITE-7138: ---------------------------------- Summary: RelToSqlConverterTest drops subquery boundary and inflates COUNT results Key: CALCITE-7138 URL: https://issues.apache.org/jira/browse/CALCITE-7138 Project: Calcite Issue Type: Bug Reporter: Lino Rosa
Take this input query: {code:java} SELECT COUNT(CASE WHEN has_salesman = 1 AND has_high_salary = 1 THEN 1 END) AS depts_with_salesman_and_high_salary, COUNT(CASE WHEN has_salesman = 1 THEN 1 END) AS depts_with_salesman, COUNT(CASE WHEN has_high_salary = 1 THEN 1 END) AS depts_with_high_salary FROM ( SELECT MAX(CASE WHEN "position_title" = 'SALESMAN' THEN 1 ELSE 0 END) AS has_salesman, MAX(CASE WHEN "salary" > 1250 THEN 1 ELSE 0 END) AS has_high_salary FROM "foodmart"."employee" GROUP BY "department_id" ) AS dept_flags;{code} After going through `SqlToRelConverter`, the resulting SQL will have the GROUP BY from the subquery inlined into a single query: {code:java} SELECT COUNT( CASE WHEN MAX(CASE WHEN "position_title" = 'SALESMAN' THEN 1 ELSE 0 END) = 1 AND MAX(CASE WHEN CAST("salary" AS DECIMAL(14, 4)) > 1250.0000 THEN 1 ELSE 0 END) = 1 THEN 1 ELSE NULL END ) AS "DEPTS_WITH_SALESMAN_AND_HIGH_SALARY", COUNT( CASE WHEN MAX(CASE WHEN "position_title" = 'SALESMAN' THEN 1 ELSE 0 END) = 1 THEN 1 ELSE NULL END ) AS "DEPTS_WITH_SALESMAN", COUNT( CASE WHEN MAX(CASE WHEN CAST("salary" AS DECIMAL(14, 4)) > 1250.0000 THEN 1 ELSE 0 END) = 1 THEN 1 ELSE NULL END ) AS "DEPTS_WITH_HIGH_SALARY" FROM "foodmart"."employee" GROUP BY "department_id"; {code} This is not equivalent to the first query. The first query counts departments, while the second—due to the GROUP BY department_id and how COUNT(...) is used—counts rows (employees) inside each qualifying department. For example, for this dataset: ||department_id||position_title||salary|| |10|SALESMAN|1000| |10|CLERK|3000| |20|MANAGER|2000| |30|SALESMAN|1100| The first query results in: ||depts_with_salesman_and_high_salary||depts_with_salesman||depts_with_high_salary|| |1|2|2| While the second query returns {*}one row per department{*}, and the numbers are row counts, not department counts: ||depts_with_salesman_and_high_salary||depts_with_salesman||depts_with_high_salary|| |2|2|2| |0|0|1| |0|1|0| -- This message was sent by Atlassian Jira (v8.20.10#820010)