Harish Butani created HIVE-6030:
-----------------------------------
Summary: Introduce Explain SubQuery Rewrite command
Key: HIVE-6030
URL: https://issues.apache.org/jira/browse/HIVE-6030
Project: Hive
Issue Type: Improvement
Reporter: Harish Butani
There are several transformations happening for SubQuery predicates (HIVE-784).
It is hard to tell from the explain plan how the query is executed.
So the goal is to introduce an explain subquery rewrite command that will show
the details of the rewrite. For example:
{noformat}
-- non-correlated example
explain subquery rewrite select * from src where src.key in (select key from
src s1 where s1.key > '9');
-- outputs:
select * from src left semi join (select key from src s1 where s1.key > '9')
sq_1 on src.key = sq_1.key where 1 = 1
-- correlated example
explain subquery rewrite select * from src where src.key in (select key from
src s1 where s1.key > '9');
-- outputs
select key
from src b left semi join (select min(value), a.key as sq_corr_0
from (select key, value, rank() over(partition by key order by value) as r
from src) a
where r <= 2
group by a.key) sq_1 on b.key = sq_1.sq_corr_0 and b.value = sq_1._c0
where 1 = 1 and key < '9'
{noformat}
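The non-correlated case above can be illustrated with a minimal Python sketch. This is not Hive's implementation: the function name rewrite_in_subquery and its parameters are hypothetical, and it works on plain strings rather than on the query AST. It only shows the shape of the IN-to-left-semi-join rewrite, including the generated sq_1 alias and the placeholder "where 1 = 1" predicate from the example output.

```python
def rewrite_in_subquery(outer_table, outer_col, subquery, sq_col, alias="sq_1"):
    """Rewrite `select * from T where T.c in (subquery)` as a left semi join.

    Hypothetical helper for illustration only; it assumes a single,
    non-correlated IN predicate and does no SQL parsing.
    """
    return (
        f"select * from {outer_table} "
        f"left semi join ({subquery}) {alias} "
        f"on {outer_table}.{outer_col} = {alias}.{sq_col} "
        # The original IN predicate is replaced by an always-true placeholder.
        f"where 1 = 1"
    )


rewritten = rewrite_in_subquery(
    outer_table="src",
    outer_col="key",
    subquery="select key from src s1 where s1.key > '9'",
    sq_col="key",
)
print(rewritten)
```

The printed string corresponds to the non-correlated output shown above, on a single line. The correlated, 'not in', and having-clause cases need more than string substitution and are not covered by this sketch.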
There are multiple rewrite cases:
- correlated vs. non-correlated
- when 'not in' operator is involved
- a where clause predicate vs. a having clause predicate
It may not be possible to output a valid Hive SQL query in all cases, but the
command should at least give the user enough information to assemble one.
This will be broken down into multiple subtasks.