[jira] Created: (HIVE-1772) optimize join followed by a groupby

Namit Jain (JIRA) Fri, 05 Nov 2010 16:31:07 -0700

optimize join followed by a groupby
-----------------------------------

                 Key: HIVE-1772
                 URL: https://issues.apache.org/jira/browse/HIVE-1772
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain



explain SELECT x.key, count(1) FROM src1 x JOIN src y ON (x.key = y.key)
group by x.key;


STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 is a root stage


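The above query is compiled into 2 map-reduce jobs.
The first MR job performs the join, and the second MR job performs the group by.
Since the join key and the group by key are the same (x.key), the rows reaching
the join reducer are already sorted on that key, so the group by can be
performed in the reducer of the join itself, without a second MR job.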
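
To illustrate the idea, here is a rough sketch in Python of a reducer that
joins and counts in a single pass. It is not Hive's implementation; the
function name join_and_count and the (key, tag) row format are hypothetical,
and it assumes non-null keys and input already sorted by key, as a reducer
would receive it after the shuffle.

```python
from itertools import groupby
from operator import itemgetter


def join_and_count(tagged_rows):
    """Inner-join two inputs on key and count joined rows per key in one pass.

    tagged_rows: iterable of (key, tag) pairs sorted by key, where tag 0
    marks a row from src1 (alias x) and tag 1 a row from src (alias y).
    This layout is an assumption made for the sketch.
    """
    result = []
    # Rows arrive sorted, so all rows for one key are adjacent.
    for key, group in groupby(tagged_rows, key=itemgetter(0)):
        left = right = 0
        for _, tag in group:
            if tag == 0:
                left += 1
            else:
                right += 1
        # An inner join emits left * right rows for this key, so count(1)
        # is that product; keys missing on either side produce no output.
        joined = left * right
        if joined:
            result.append((key, joined))
    return result


rows = sorted([("a", 0), ("a", 1), ("a", 1), ("b", 0), ("c", 1)])
print(join_and_count(rows))
```

Because the aggregate is finished as soon as the key changes, no second
shuffle or sort is needed for the group by.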


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
