[ 
https://issues.apache.org/jira/browse/HIVE-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-5358:
------------------------------

    Attachment: D13113.1.patch

chenchun requested code review of "HIVE-5358 [jira] ReduceSinkDeDuplication 
should ignore column orders when check overlapping part of keys between parent 
and child".

Reviewers: JIRA

HIVE-5358

select key, value from (select key, value from src group by key, value) t group 
by key, value;

This query can be optimized by ReduceSinkDeDuplication.

select key, value from (select key, value from src group by key, value) t group 
by value, key;

However, the query above currently cannot be optimized by ReduceSinkDeDuplication 
because the parent and child operators list the same key columns in different orders.
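
The difference between the two queries can be illustrated with a minimal sketch. 
This is not the code from D13113.1.patch: the class KeyOverlapSketch and both 
method names are hypothetical, key columns are modeled as plain strings, and 
everything else the real optimizer would have to check is left out. It only 
contrasts a position-by-position comparison of key columns with one that 
ignores column order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

/** Hypothetical illustration only; not taken from the Hive source tree. */
public class KeyOverlapSketch {

    /** Order-sensitive check: keys must match position by position. */
    public static boolean sameKeysInOrder(List<String> parentKeys, List<String> childKeys) {
        return parentKeys.equals(childKeys);
    }

    /** Order-insensitive check: both operators use the same set of key columns. */
    public static boolean sameKeysIgnoringOrder(List<String> parentKeys, List<String> childKeys) {
        return new HashSet<>(parentKeys).equals(new HashSet<>(childKeys));
    }

    public static void main(String[] args) {
        // Inner query: group by key, value
        List<String> parent = Arrays.asList("key", "value");
        // Outer query of the second example: group by value, key
        List<String> child = Arrays.asList("value", "key");

        System.out.println("in order:       " + sameKeysInOrder(parent, child));
        System.out.println("ignoring order: " + sameKeysIgnoringOrder(parent, child));
    }
}
```

For the second query the order-sensitive check fails while the order-insensitive 
one succeeds, which is the case this issue asks ReduceSinkDeDuplication to handle.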

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D13113

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeColumnListDesc.java
  ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out

To: JIRA, chenchun

                
> ReduceSinkDeDuplication should ignore column orders when check overlapping 
> part of keys between parent and child
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5358
>                 URL: https://issues.apache.org/jira/browse/HIVE-5358
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Chun Chen
>            Assignee: Chun Chen
>         Attachments: D13113.1.patch, HIVE-5358.patch
>
>
> {code}
> select key, value from (select key, value from src group by key, value) t 
> group by key, value;
> {code}
> This query can be optimized by ReduceSinkDeDuplication.
> {code}
> select key, value from (select key, value from src group by key, value) t 
> group by value, key;
> {code}
> However, the query above currently cannot be optimized by ReduceSinkDeDuplication 
> because the parent and child operators list the same key columns in different orders.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
