[
https://issues.apache.org/jira/browse/IMPALA-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950762#comment-17950762
]
weihua zhang edited comment on IMPALA-13483 at 5/13/25 3:28 AM:
----------------------------------------------------------------
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 not in ( select
t3.k1 from t3 where t1.k2 = t3.k2 ) or t1.k1 < 10 order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 in ( select t3.k1
from t3 ) order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 in ( select t3.k1
from t3 where t1.k2 < t3.k2 ) order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select * from t1 where exists(select distinct k1 from t2 where t1.k1 > t2.k3 or
t1.k2 < t2.v1) order by t1.k1, t1.k2
calcite's bug, but calcite main branch have fixed it!
+-------------------+-------------------+
| k1 (tid=9 sid=16) | k2 (tid=9 sid=17) |
+-------------------+-------------------+
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 3 | 3 |
| 3 | 4 |
| 20 | 2 |
| 22 | 3 |
| 24 | 4 |
+-------------------+-------------------+
Fetched 8 row(s) in 0.53s
-- duckdb result
select * from t1 where exists(select distinct k1 from t2 where t1.k1 > t2.k3 or
t1.k2 < t2.v1) order by t1.k1, t1.k2;
┌────┬────┐
│ k1 ┆ k2 │
╞════╪════╡
│ 1 ┆ 2 │
│ 1 ┆ 3 │
│ 2 ┆ 4 │
│ 3 ┆ 3 │
│ 3 ┆ 4 │
│ 20 ┆ 2 │
│ 22 ┆ 3 │
│ 24 ┆ 4 │
│ ┆ 1 │
│ ┆ 2 │
└────┴────┘
{code}
was (Author: JIRAUSER307426):
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 not in ( select
t3.k1 from t3 where t1.k2 = t3.k2 ) or t1.k1 < 10 order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 in ( select t3.k1
from t3 ) order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select t1.* from t1 left join t2 on t1.k2 = t2.k3 and t1.k1 in ( select t3.k1
from t3 where t1.k2 < t3.k2 ) order by t1.k1, t1.k2
calcite 1.39 can generate plan, but return wrong result
{code}
{code:sql}
select * from t1 where exists(select distinct k1 from t2 where t1.k1 > t2.k3 or
t1.k2 < t2.v1) order by t1.k1, t1.k2
calcite's bug
+-------------------+-------------------+
| k1 (tid=9 sid=16) | k2 (tid=9 sid=17) |
+-------------------+-------------------+
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 3 | 3 |
| 3 | 4 |
| 20 | 2 |
| 22 | 3 |
| 24 | 4 |
+-------------------+-------------------+
Fetched 8 row(s) in 0.53s
-- duckdb result
select * from t1 where exists(select distinct k1 from t2 where t1.k1 > t2.k3 or
t1.k2 < t2.v1) order by t1.k1, t1.k2;
┌────┬────┐
│ k1 ┆ k2 │
╞════╪════╡
│ 1 ┆ 2 │
│ 1 ┆ 3 │
│ 2 ┆ 4 │
│ 3 ┆ 3 │
│ 3 ┆ 4 │
│ 20 ┆ 2 │
│ 22 ┆ 3 │
│ 24 ┆ 4 │
│ ┆ 1 │
│ ┆ 2 │
└────┴────┘
{code}
> Calcite Planner: some scalar subquery throws exception when handle
> single_value
> -------------------------------------------------------------------------------
>
> Key: IMPALA-13483
> URL: https://issues.apache.org/jira/browse/IMPALA-13483
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: weihua zhang
> Priority: Major
>
> {code:sql}
> create table correlated_scalar_t1(c1 bigint, c2 bigint);
> create table correlated_scalar_t2(c1 bigint, c2 bigint);
> insert into correlated_scalar_t1 values (1,null),(null,1),(1,2),
> (null,2),(1,3), (2,4), (2,5), (3,3), (3,4), (20,2), (22,3),
> (24,4),(null,null);
> insert into correlated_scalar_t2 values (1,null),(null,1),(1,4), (1,2),
> (null,3), (2,4), (3,7), (3,9),(null,null),(5,1);
> select c1 from correlated_scalar_t1 where correlated_scalar_t1.c2 > (select
> c1 from correlated_scalar_t2 where correlated_scalar_t1.c1 =
> correlated_scalar_t2.c1 and correlated_scalar_t2.c2 < 4) order by c1;{code}
> {code:java}
> LogicalSort(sort0=[$0], dir0=[ASC]), id = 717
> LogicalProject(C1=[$0]), id = 716
> LogicalJoin(condition=[AND(=($0, $2), >($1, $3))], joinType=[inner]), id
> = 715
> LogicalTableScan(table=[[default, correlated_scalar_t1]]), id = 547
> LogicalAggregate(group=[{0}], agg#0=[SINGLE_VALUE($1)]), id = 714
> LogicalProject(c11=[$0], C1=[$0]), id = 713
> LogicalFilter(condition=[AND(<($1, 4), IS NOT NULL($0))]), id = 712
> LogicalTableScan(table=[[default, correlated_scalar_t2]]), id =
> 549
> {code}
> Exception: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
> may be related to single_value
> hive plan:
> {code:java}
> explain cbo select c1 from correlated_scalar_t1 where correlated_scalar_t1.c2
> > (select c1 from correlated_scalar_t2 where correlated_scalar_t1.c1 =
> correlated_scalar_t2.c1 and correlated_scalar_t2.c2 < 4) order by c1;
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | CBO PLAN: |
> | HiveSortLimit(sort0=[$0], dir0=[ASC]) |
> | HiveProject(c1=[$0]) |
> | HiveJoin(condition=[AND(=($0, $4), >($1, $3))], joinType=[inner],
> algorithm=[none], cost=[not available]) |
> | HiveJoin(condition=[=($0, $2)], joinType=[left], algorithm=[none],
> cost=[not available]) |
> | HiveProject(c1=[$0], c2=[$1]) |
> | HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))]) |
> | HiveTableScan(table=[[default, correlated_scalar_t1]],
> table:alias=[correlated_scalar_t1]) |
> | HiveProject(c10=[$0]) |
> | HiveFilter(condition=[sq_count_check($1)]) |
> | HiveAggregate(group=[{0}], cnt=[COUNT()]) |
> | HiveFilter(condition=[AND(<($1, 4), IS NOT NULL($0))]) |
> | HiveTableScan(table=[[default, correlated_scalar_t2]],
> table:alias=[correlated_scalar_t2]) |
> | HiveProject(c1=[$0], c10=[$0]) |
> | HiveFilter(condition=[AND(<($1, 4), IS NOT NULL($0))]) |
> | HiveTableScan(table=[[default, correlated_scalar_t2]],
> table:alias=[correlated_scalar_t2]) |
> | |
> +----------------------------------------------------+
> 17 rows selected (0.935 seconds)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]