[ https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825491#comment-15825491 ]
Xuefu Zhang commented on HIVE-15544:
------------------------------------

Re: Can you explain what do you mean by semantic problem at runtime?

Since Hive doesn't know whether a subquery will produce a single value, your proposal has Hive fail the query at runtime if it doesn't. This basically turns a semantic problem into a runtime problem by relaxing the semantic check. It seems harsh (and probably unacceptable) for a query that has run for hours to end in failure. I haven't read your patch, so let me know if I misunderstood your proposal.

> Support scalar subqueries
> -------------------------
>
>                 Key: HIVE-15544
>                 URL: https://issues.apache.org/jira/browse/HIVE-15544
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>              Labels: sub-query
>         Attachments: HIVE-15544.1.patch, HIVE-15544.2.patch,
> HIVE-15544.3.patch
>
>
> Currently Hive only supports IN/EXISTS/NOT IN/NOT EXISTS subqueries. Hive
> doesn't allow subqueries such as:
> {code}
> explain select a.ca_state state, count(*) cnt
>  from customer_address a
>      ,customer c
>      ,store_sales s
>      ,date_dim d
>      ,item i
>  where a.ca_address_sk = c.c_current_addr_sk
>    and c.c_customer_sk = s.ss_customer_sk
>    and s.ss_sold_date_sk = d.d_date_sk
>    and s.ss_item_sk = i.i_item_sk
>    and d.d_month_seq =
>        (select distinct (d_month_seq)
>         from date_dim
>         where d_year = 2000
>           and d_moy = 2 )
>    and i.i_current_price > 1.2 *
>        (select avg(j.i_current_price)
>         from item j
>         where j.i_category = i.i_category)
>  group by a.ca_state
>  having count(*) >= 10
>  order by cnt
>  limit 100;
> {code}
> We initially plan to support such scalar subqueries in filters, i.e. WHERE and
> HAVING.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)