[ 
https://issues.apache.org/jira/browse/HIVE-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947706#comment-14947706
 ] 

Prasanth Jayachandran commented on HIVE-12025:
----------------------------------------------

The changes this patch introduces in BucketIdResolverImpl are the correct way 
to compute the bucket number. ReduceSinkOperator had a bug in its bucket number 
computation for negative hash codes (multiplying by -1 vs. masking with 
Integer.MAX_VALUE). Some tests may fail because of this change, but those 
differences are expected. Since these are util methods, it would be good to 
have unit tests for them (if none exist).
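
To illustrate why the two approaches disagree on negative hash codes, here is 
a minimal standalone sketch. It is not Hive's code: the class and method names 
(BucketSketch, bucketByNegation, bucketByMask) are hypothetical, and the 
negation variant only approximates the behavior described above.

```java
class BucketSketch {
    // Hypothetical helper: take the remainder, then flip the sign if negative.
    static int bucketByNegation(int hashCode, int numBuckets) {
        int bucket = hashCode % numBuckets;
        return bucket < 0 ? -1 * bucket : bucket;
    }

    // Hypothetical helper: clear the sign bit first, then take the remainder.
    static int bucketByMask(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        // A positive hash, a negative hash, and the most negative int.
        for (int h : new int[] {7, -7, Integer.MIN_VALUE}) {
            System.out.println(h + " -> negate=" + bucketByNegation(h, numBuckets)
                    + ", mask=" + bucketByMask(h, numBuckets));
        }
    }
}
```

Both variants return a value in [0, numBuckets), but for a negative hash code 
they can pick different buckets, so rows written with one scheme may not be 
found where the other scheme expects them.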

Other than that, lgtm +1. Pending tests.

> refactor bucketId generating code
> ---------------------------------
>
>                 Key: HIVE-12025
>                 URL: https://issues.apache.org/jira/browse/HIVE-12025
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.0.1
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-12025.2.patch, HIVE-12025.patch
>
>
> HIVE-11983 adds ObjectInspectorUtils.getBucketHashCode() and 
> getBucketNumber().
> There are at least several places in Hive that perform this computation:
> # ReduceSinkOperator.computeBucketNumber
> # ReduceSinkOperator.computeHashCode
> # BucketIdResolverImpl - only in 2.0.0 ASF line
> # FileSinkOperator.findWriterOffset
> # GenericUDFHash
> We should refactor these so that they all call the methods from 
> ObjectInspectorUtils.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
