[ https://issues.apache.org/jira/browse/HIVE-21531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803607#comment-16803607 ]
Gopal V commented on HIVE-21531:
--------------------------------

{code}
$ scala -cp ~/hw/hive/ql/target/hive-exec-3.2.0-SNAPSHOT.jar
Picked up _JAVA_OPTIONS: -Djava.awt.headless=true -Xmx2048m
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0).
Type in expressions for evaluation. Or try :help.

scala> import org.apache.hive.common.util.Murmur3;
import org.apache.hive.common.util.Murmur3

scala> import org.apache.hive.common.util.HashCodeUtil;
import org.apache.hive.common.util.HashCodeUtil

scala> val bytes = Array[Byte]('a','b','c','d');
bytes: Array[Byte] = Array(97, 98, 99, 100)

scala> HashCodeUtil.calculateBytesHashCode(bytes, 0, 4);
res0: Int = 646393889

scala> Murmur3.hash32(bytes, 0, 4, 0);
res1: Int = 1139631978
{code}

> Vectorization: all NULL hashcodes are not computed using Murmur3
> ----------------------------------------------------------------
>
>                 Key: HIVE-21531
>                 URL: https://issues.apache.org/jira/browse/HIVE-21531
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Major
>
> The comments in Vectorized hash computation call out the MurmurHash
> implementation (the one using 0x5bd1e995), while the non-vectorized codepath
> calls out the Murmur3 one (using 0xcc9e2d51).
> The comments here are wrong
> {code}
>   /**
>    * Batch compute the hash codes for all the serialized keys.
>    *
>    * NOTE: MAJOR MAJOR ASSUMPTION:
>    *     We assume that HashCodeUtil.murmurHash produces the same result
>    *     as MurmurHash.hash with seed = 0 (the method used by ReduceSinkOperator for
>    *     UNIFORM distribution).
>    */
>   protected void computeSerializedHashCodes() {
>     int offset = 0;
>     int keyLength;
>     byte[] bytes = output.getData();
>     for (int i = 0; i < nonNullKeyCount; i++) {
>       keyLength = serializedKeyLengths[i];
>       hashCodes[i] = Murmur3.hash32(bytes, offset, keyLength, 0);
>       offset += keyLength;
>     }
>   }
> {code}
> but the wrong comment is followed in the Vector RS operator
> {code}
>       System.arraycopy(nullKeyOutput.getData(), 0, nullBytes, 0, nullBytesLength);
>       nullKeyHashCode = HashCodeUtil.calculateBytesHashCode(nullBytes, 0, nullBytesLength);
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
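
For readers without a Hive build at hand, the Murmur3 side of the session above can be reproduced with a standalone sketch. The class below is a hypothetical illustration written from the public MurmurHash3 x86_32 algorithm (the one using 0xcc9e2d51); it is not Hive's org.apache.hive.common.util.Murmur3 source, and the name Murmur3Sketch is an assumption. It does not attempt to reproduce HashCodeUtil.calculateBytesHashCode.

```java
// Hypothetical standalone sketch of MurmurHash3 x86_32; not Hive's own class.
class Murmur3Sketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    static int hash32(byte[] data, int offset, int length, int seed) {
        int h = seed;
        int nblocks = length >> 2;

        // Body: consume the input four bytes at a time, little-endian.
        for (int i = 0; i < nblocks; i++) {
            int p = offset + (i << 2);
            int k = (data[p] & 0xff)
                  | ((data[p + 1] & 0xff) << 8)
                  | ((data[p + 2] & 0xff) << 16)
                  | ((data[p + 3] & 0xff) << 24);
            k *= C1;
            k = Integer.rotateLeft(k, 15);
            k *= C2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining 1 to 3 bytes (intentional fall-through).
        int tail = offset + (nblocks << 2);
        int k = 0;
        switch (length & 3) {
            case 3:
                k ^= (data[tail + 2] & 0xff) << 16;
            case 2:
                k ^= (data[tail + 1] & 0xff) << 8;
            case 1:
                k ^= (data[tail] & 0xff);
                k *= C1;
                k = Integer.rotateLeft(k, 15);
                k *= C2;
                h ^= k;
        }

        // Finalization: fold in the length and avalanche the bits.
        h ^= length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        byte[] bytes = {'a', 'b', 'c', 'd'};
        // Same input as the scala session: 4 bytes, seed 0.
        System.out.println(hash32(bytes, 0, 4, 0)); // prints 1139631978
    }
}
```

The printed value matches res1 from the session (Murmur3.hash32) and not res0 (HashCodeUtil.calculateBytesHashCode), which is consistent with the two code paths using different hash functions.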