Hi Hive users,
According to the grammar for Joins at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins,
Hive should be able to parse joins where the table reference on the
right-hand side is itself a join. For instance, one should be able to
specify (a join (b join c)) as:
a LEFT SEMI JOIN b LEFT SEMI JOIN c ON b.x = c.x ON a.y = b.y
However, I have been unsuccessful in issuing such queries to Hive. I am
using Hive-0.9.0:
hive> select * from orderdetails left semi join orders left semi join
sites on (sites.siteid = orders.siteid) on (orderdetails.orderid =
orders.orderid);
FAILED: Parse Error: line 1:104 mismatched input 'on' expecting EOF near ')'
Explicit parenthesization of the join leads to parse error too --
parenthesis supposedly signals start of a subquery.
hive> select * from orderdetails left semi join ( orders left semi join
sites on (sites.siteid = orders.siteid) ) on
(orderdetails.orderid=orders.orderid);
FAILED: Parse Error: line 1:44 cannot recognize input near 'orders'
'left' 'semi' in subquery source
1. It seems to me that Hive parser currently only supports joins
specified in a left-recursive manner, i.e, rhs limited to a table name.
Is this correct? The manual merely says that all Joins are assumed
left-associative, but I could not find a mention of whether
parenthesization is allowed for nesting.
2. Are there plans to support arbitrarily nested joins in future, if not
currently allowed?
Regards,
Avijit