Re: Hi, Hive People urgent question about [Distribute By] function

Gopal Vijayaraghavan Thu, 22 Oct 2015 09:26:39 -0700

> When applying [Distribute By] on Hive to the framework, the function
>should be partitionByHash on Flink. This is to spread out all the rows
>distributed by a hash key from Object Class in Java.


Hive does not use the Object hashCode - the identityHashCode is
inconsistent across JVMs, so Object.hashCode() cannot be used here.

ObjectInspectorUtils::hashCode() is the hash code used by DISTRIBUTE BY
(DBY) in Hive (SORT BY uses a random number generator).
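
To illustrate why an identity-based hash does not work for distributing
rows, here is a minimal sketch. It is not Hive's code: the Key class,
contentHash() and partitionFor() are hypothetical names made up for this
example, and Arrays.hashCode() over the UTF-8 bytes only stands in for a
content-based hash such as ObjectInspectorUtils::hashCode(). The sign-bit
mask followed by a modulo is the same convention as Hadoop's
HashPartitioner.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DistributeByHashSketch {

    // Hypothetical row key that does NOT override hashCode(), so
    // key.hashCode() falls back to the identity hash of the instance.
    static final class Key {
        final String value;

        Key(String value) {
            this.value = value;
        }
    }

    // Content-based hash: depends only on the bytes of the value, so two
    // keys with equal content hash the same in every JVM.
    // (Stand-in for illustration, not Hive's actual hash function.)
    static int contentHash(Key key) {
        return Arrays.hashCode(key.value.getBytes(StandardCharsets.UTF_8));
    }

    // Map a hash to a partition index in [0, numPartitions).
    // The mask clears the sign bit so the modulo is never negative.
    static int partitionFor(int hash, int numPartitions) {
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        Key a = new Key("user-42");
        Key b = new Key("user-42");

        // Identity hashes belong to the instances, not to the content;
        // they normally differ here and can change from run to run.
        System.out.println("identity hashes: " + a.hashCode() + " vs " + b.hashCode());

        // Content hashes are equal, so both rows go to the same partition.
        System.out.println("content hashes:  " + contentHash(a) + " vs " + contentHash(b));
        System.out.println("partition (of 8): " + partitionFor(contentHash(a), 8)
                + " vs " + partitionFor(contentHash(b), 8));
    }
}
```

With the identity hash, two rows carrying the same key value could be
sent to different partitions, which breaks the guarantee that all rows
with the same key end up together.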

Cheers,
Gopal
