[jira] [Commented] (HIVE-15544) Support scalar subqueries

Xuefu Zhang (JIRA) Mon, 16 Jan 2017 21:49:23 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825491#comment-15825491
 ]


Xuefu Zhang commented on HIVE-15544:
------------------------------------

Re: Can you explain what you mean by semantic problem at runtime?

Since Hive doesn't know in advance whether a subquery will produce a single 
value, your proposal has Hive fail the query at runtime if it doesn't. By 
relaxing the semantic check, this basically turns a semantic problem into a 
runtime problem. It seems harsh (and probably unacceptable) that a query that 
has run for hours ends up failing.
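
To make the concern concrete, here is a sketch built from the two subqueries in 
the issue description. The outer queries are simplified for illustration, and 
how the patch actually treats each case is an assumption here, not something 
taken from the patch.

{code}
-- Aggregate with no GROUP BY: always returns exactly one row
-- (NULL if nothing matches), so it can be accepted at compile time.
select i.i_item_sk
from item i
where i.i_current_price > 1.2 *
      (select avg(j.i_current_price)
       from item j
       where j.i_category = i.i_category);

-- No aggregate: the row count depends on the data, so a compile-time
-- check cannot prove this returns a single value. If it returns more
-- than one row, the error can only surface while the query is running.
select d.d_date_sk
from date_dim d
where d.d_month_seq =
      (select distinct d_month_seq
       from date_dim
       where d_year = 2000
         and d_moy = 2);
{code}

The second form is the one where a long-running query could fail late.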

I haven't read your patch, so let me know if I misunderstood your proposal.

> Support scalar subqueries
> -------------------------
>
>                 Key: HIVE-15544
>                 URL: https://issues.apache.org/jira/browse/HIVE-15544
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>              Labels: sub-query
>         Attachments: HIVE-15544.1.patch, HIVE-15544.2.patch, 
> HIVE-15544.3.patch
>
>
> Currently Hive only supports IN/EXISTS/NOT IN/NOT EXISTS subqueries. Hive 
> doesn't allow subqueries such as:
> {code}
> explain select a.ca_state state, count(*) cnt
>  from customer_address a
>      ,customer c
>      ,store_sales s
>      ,date_dim d
>      ,item i
>  where a.ca_address_sk = c.c_current_addr_sk
>    and c.c_customer_sk = s.ss_customer_sk
>    and s.ss_sold_date_sk = d.d_date_sk
>    and s.ss_item_sk = i.i_item_sk
>    and d.d_month_seq =
>        (select distinct (d_month_seq)
>         from date_dim
>         where d_year = 2000
>           and d_moy = 2)
>    and i.i_current_price > 1.2 *
>        (select avg(j.i_current_price)
>         from item j
>         where j.i_category = i.i_category)
>  group by a.ca_state
>  having count(*) >= 10
>  order by cnt
>  limit 100;
> {code}
> We initially plan to support such scalar subqueries in filters, i.e. WHERE and 
> HAVING.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
