[ https://issues.apache.org/jira/browse/HIVE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679907#comment-13679907 ]
Matthew Weaver commented on HIVE-784: ------------------------------------- I'm looking into uncorrelated subqueries. Will pick this up unless someone else is actively working it. h5. Functional Requirements * Support {{WHERE x IN (<column subquery>);}} ** {{<column subquery>}} returns one column, any number of rows. * Support {{WHERE x NOT IN (<column subquery>)}}; * Support same types of subqueries in {{HAVING}}. ** E.g. {code:sql} SELECT key FROM t1 GROUP BY key HAVING COUNT(value) IN (SELECT p FROM t2); {code} * Correlated subqueries not supported, for now at least ** But still need to check for correlation, and bail if it occurs. ** Correlated subquery: *** A subquery that references a table that appears in a containing query ([MySQL|http://dev.mysql.com/doc/refman/5.7/en/correlated-subqueries.html]), thus requiring subquery evaluation to look outside its scope. *** The subquery depends on the outer query for its values, so the subquery must be executed once for each row of the outer query. Also known as _repeating Subqueries_. h5. Tasks * Rewrite {{IN (<column-subquery>)}} as a {{LEFT SEMI JOIN}}. ** Not ready for public consumption. In particular, no check for correlated terms. ** With test queries. * Add check for correlated terms, return informative error message. * Rewrite {{WHERE NOT IN (<column-subquery>)}} as a {{LEFT OUTER JOIN}}. ** Return rows that don't match the right side * Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER JOIN}} as above. > Support uncorrelated subqueries in the WHERE clause > --------------------------------------------------- > > Key: HIVE-784 > URL: https://issues.apache.org/jira/browse/HIVE-784 > Project: Hive > Issue Type: New Feature > Components: Query Processor > Reporter: Ning Zhang > Assignee: Matthew Weaver > > Hive currently only support views in the FROM-clause, some Facebook use cases > suggest that Hive should support subqueries such as those connected by > IN/EXISTS in the WHERE-clause. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira