Hello,

We were approached to add support for Hive SQL in jOOQ [1], which might be 
useful for a greater community in general. I've gone through the Hive SQL 
syntax and I've encountered this interesting clause: The LEFT SEMI JOIN clause 
[2]

Example:
SELECT a.key, a.val
FROM a LEFT SEMI JOIN b on (a.key = b.key)

My question is: Why "LEFT"? I find this misleading. Compare this with a LEFT 
OUTER JOIN, which will always return all tuples from relation A at least once, 
regardless if there are any tuples in relation B matched by the JOIN predicate 
in "ON". So in other words, a LEFT SEMI JOIN is completely useless as it 
*should* always return ALL tuples from relation A, compared to a more useful 
"SEMI JOIN" or "INNER SEMI JOIN".

What do you think?
Lukas

[1]: http://www.jooq.org
[2]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins

--
Lukas Eder - Head of R&D | lukas.e...@datageekery.com | +41 44 586 82 56
Data Geekery GmbH | Binzstrasse 23 | CH-8045 Zürich | Switzerland

http://www.datageekery.com | Get back in control of your SQLT



Reply via email to