Carter Shanklin created HIVE-16725:
--------------------------------------

             Summary: Support recursive CTEs
                 Key: HIVE-16725
                 URL: https://issues.apache.org/jira/browse/HIVE-16725
             Project: Hive
          Issue Type: Sub-task
            Reporter: Carter Shanklin


Hive introduced non-recursive CTEs in HIVE-1180.

Recursive CTEs are commonly used to navigate hierarchies stored in relational 
tables where a parent ID column "foreign key" refers to another "primary key" 
field within the same table. In this context recursive CTEs are used to 
traverse hierarchies, determine parents / children, measure depths, build paths 
and so on.

Recursive CTEs are constructed similarly to basic CTEs but include 2 queries at 
a minimum: first a root query which is combined via UNION / UNION ALL to 
additional queries that can refer to the CTE's table name.

Support should include:
* Basic recursive CTE support: i.e. allow the CTE's table name to be referred 
in the table subquery after a UNION or UNION ALL.
* Recursive CTEs should be supported as basic queries, in views, or in 
subqueries.
* Loop detection is highly desirable. If a loop is detected the query should 
fail at runtime. Hive is commonly used in shared clusters where it is difficult 
to track down rogue queries.
* To ease portability, suggest  to not require the recursive keyword. It could 
be made optional.
* To ease portability, "with column list", i.e. with t(col1, col2) as ( ... ) 
should be supported.

Example (Postgres compatible):
{code}
create table hierarchy (id integer, parent integer);
insert into hierarchy values (1, null), (2, 1), (3, 2);

with recursive t(id, parent) as (
  select id, parent from hierarchy where parent is null
  union all select hierarchy.id, hierarchy.parent from hierarchy, t where t.id 
= hierarchy.parent
) select * from t;
 id | parent
----+--------
  1 |
  2 |      1
  3 |      2
(3 rows)

update hierarchy set parent = 3 where id = 1;

with recursive t(id, parent) as (
  select id, parent from hierarchy where parent = 1
  union all select hierarchy.id, hierarchy.parent from hierarchy, t where t.id 
= hierarchy.parent
) select * from t;
[ Query runs forever ]
{code}

Implementation Notes:
The SQL standard requires use of the "recursive" keyword for recursive CTEs. 
However, major commercial databases including Oracle, SQL Server and DB2 do not 
require, or in some cases, don't even allow the "recursive" keyword. Postgres 
requires the "recursive" keyword.

If Oracle detects a loop it fails with this message: ORA-32044: cycle detected 
while executing recursive WITH query
If Postgres encounters a loop in a recursive CTE, the query runs forever and 
must be killed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to