[ 
https://issues.apache.org/jira/browse/HIVE-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779645#comment-13779645
 ] 

Chun Chen commented on HIVE-5358:
---------------------------------

Sorry for misunderstanding the intention of checkExprs in 
ReduceSinkDeDuplication.
[~ashutoshc] I will try to preserve the order of the key columns on the RS in 
those test cases.

{code}
select c3, c2 from (select c1, c2, c3 from t1 order by c1, c2, c3) t group by 
c3, c2;
{code}
[~yhuai] I don't understand what you mean by the SQL above. If we use [c3, 
c2] as the key columns, what is the problem with that?
                
> ReduceSinkDeDuplication should ignore column orders when check overlapping 
> part of keys between parent and child
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5358
>                 URL: https://issues.apache.org/jira/browse/HIVE-5358
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Chun Chen
>            Assignee: Chun Chen
>         Attachments: D13113.1.patch, HIVE-5358.2.patch, HIVE-5358.patch
>
>
> {code}
> select key, value from (select key, value from src group by key, value) t 
> group by key, value;
> {code}
> This can be optimized by ReduceSinkDeDuplication
> {code}
> select key, value from (select key, value from src group by key, value) t 
> group by value, key;
> {code}
> However, the SQL above currently can't be optimized by 
> ReduceSinkDeDuplication because the key column orders of the parent and 
> child operators differ.
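
The difference between the two queries in the description can be illustrated with a minimal sketch. It is not Hive's actual checkExprs logic; the function names `same_keys_ordered` and `same_keys_unordered` are hypothetical and key columns are modeled as plain lists of strings. It only contrasts a positional comparison of key columns with an order-insensitive one.

```python
def same_keys_ordered(parent_keys, child_keys):
    """Positional comparison: keys must match column by column."""
    return list(parent_keys) == list(child_keys)


def same_keys_unordered(parent_keys, child_keys):
    """Order-insensitive comparison: only the set of key columns must match."""
    return (len(parent_keys) == len(child_keys)
            and set(parent_keys) == set(child_keys))


# Hypothetical stand-ins for the key columns of the two group-by stages.
parent = ["key", "value"]         # inner query: group by key, value
child_same = ["key", "value"]     # outer query: group by key, value
child_swapped = ["value", "key"]  # outer query: group by value, key

print(same_keys_ordered(parent, child_same))
print(same_keys_ordered(parent, child_swapped))
print(same_keys_unordered(parent, child_swapped))
```

Under these assumptions, only the order-insensitive check treats `group by value, key` as matching the parent's keys. Whether ignoring the order is safe depends on whether anything relies on the parent's sort order, which is why the comment above talks about preserving the order of the key columns.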

