[jira] [Commented] (HIVE-7767) hive.optimize.union.remove does not work properly [Spark Branch]

Na Yang (JIRA) Tue, 19 Aug 2014 17:33:31 -0700

    [ https://issues.apache.org/jira/browse/HIVE-7767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103146#comment-14103146 ]


Na Yang commented on HIVE-7767:
-------------------------------

After looking into this issue, I found the cause.

With "hive.optimize.union.remove=true", the optimizer removes the union 
operator from the operator tree, which ends up producing two separate graphs 
in the Spark transformation graph. The current GraphTran execute API cannot 
handle multiple graphs properly. We need to change the execute implementation 
in GraphTran.java so that it handles multiple transformation graphs. I will 
upload a patch shortly after HIVE-7717's patch is committed.
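
To illustrate the idea, here is a minimal, self-contained sketch of an 
execute step that runs every disconnected graph and concatenates the outputs, 
which is what UNION ALL requires. It is not the GraphTran code and not the 
planned patch: the class MultiGraphSketch, the Graph interface, and the 
methods countByKey and executeAll are hypothetical names invented for this 
example, and the input keys are only inferred from the counts shown in the 
issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from Hive or Spark.
public class MultiGraphSketch {

    /** One disconnected transformation graph, reduced here to a single function. */
    interface Graph extends Function<List<String>, List<String>> {}

    /** Stand-in for "SELECT key, count(1) ... GROUP BY key" over a list of keys. */
    static List<String> countByKey(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            rows.add(e.getKey() + "\t" + e.getValue());
        }
        return rows;
    }

    /** Runs every graph on the input and concatenates the results in order. */
    static List<String> executeAll(List<Graph> graphs, List<String> input) {
        List<String> out = new ArrayList<>();
        for (Graph graph : graphs) {
            out.addAll(graph.apply(input));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: keys inferred from the counts in the issue description.
        List<String> keys = Arrays.asList("1", "2", "3", "7", "8", "8");

        // Two graphs, one per branch of the removed union.
        List<Graph> graphs = new ArrayList<>();
        graphs.add(MultiGraphSketch::countByKey);
        graphs.add(MultiGraphSketch::countByKey);

        for (String row : executeAll(graphs, keys)) {
            System.out.println(row);
        }
    }
}
```

With two graphs in the list, the sketch emits the five grouped rows twice, 
ten rows in total. An execute step that evaluated only one of the graphs 
would emit five rows, which matches the row count of the incorrect result in 
the issue description.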

> hive.optimize.union.remove does not work properly [Spark Branch]
> ----------------------------------------------------------------
>
>                 Key: HIVE-7767
>                 URL: https://issues.apache.org/jira/browse/HIVE-7767
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Na Yang
>            Assignee: Na Yang
>
> Turning on the hive.optimize.union.remove property produces an incorrect 
> UNION ALL result. 
> For example:
> {noformat}
> create table inputTbl1(key string, val string) stored as textfile;
> load data local inpath '../../data/files/T1.txt' into table inputTbl1;
> SELECT *
> FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, count(1) as values from inputTbl1 group by key
> ) a;  
> {noformat}
> When hive.optimize.union.remove is turned on, the query result is: 
> {noformat}
> 1     1
> 2     1
> 3     1
> 7     1
> 8     2
> {noformat}
> When hive.optimize.union.remove is turned off, the query result is: 
> {noformat}
> 7     1
> 2     1
> 8     2
> 3     1
> 1     1
> 7     1
> 2     1
> 8     2
> 3     1
> 1     1
> {noformat}
> The expected query result is:
> {noformat}
> 7     1
> 2     1
> 8     2
> 3     1
> 1     1
> 7     1
> 2     1
> 8     2
> 3     1
> 1     1
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
