Gopal V created HIVE-8349:
-----------------------------

             Summary: DISTRIBUTE BY should work with tez auto-parallelism enabled
                 Key: HIVE-8349
                 URL: https://issues.apache.org/jira/browse/HIVE-8349
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V


The current implementation of DISTRIBUTE BY does not work when Tez 
auto-parallelism is turned on, because of hashCode distribution issues.

In the DISTRIBUTE BY case, the key is actually zero bytes, and partitioning 
is driven only by hashCode. This adversely affects the uniform hashing 
implementation.
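
A minimal sketch of why a zero-byte key is a problem for hash-based 
partitioning, assuming the partitioner hashes the serialized key bytes. This 
is not Hive or Tez code: CRC32 stands in for whatever hash the uniform 
hashing path actually uses, and partition_for, NUM_PARTITIONS and the sample 
rows are hypothetical names made up for illustration.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical reducer count


def partition_for(key_bytes: bytes, num_partitions: int) -> int:
    """Pick a partition by hashing the given bytes (CRC32 as a stand-in hash)."""
    return zlib.crc32(key_bytes) % num_partitions


# Hypothetical rows: (distribute-by column value, payload)
rows = [("user%d" % i, i) for i in range(1000)]

# Case 1: the serialized key is empty, so every row hashes identically
# and all rows land in the same partition.
empty_key_partitions = {partition_for(b"", NUM_PARTITIONS) for _ in rows}

# Case 2: hashing the distribute-by column value instead spreads the rows
# over more than one partition.
column_partitions = {
    partition_for(name.encode("utf-8"), NUM_PARTITIONS) for name, _ in rows
}

print("partitions used with empty key:", len(empty_key_partitions))
print("partitions used with column value:", len(column_partitions))
```

If the real code path behaves like case 1, any scheme that sizes reducers 
from per-partition output would see a single loaded partition, regardless of 
how many were requested.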

In an ideal scenario, the edge should switch from the ordered KV input to an 
unordered partitioned edge, which would speed up processing massively.



