[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077534#comment-16077534 ]
Rui Li edited comment on HIVE-17010 at 7/7/17 3:53 AM:
-------------------------------------------------------
+1

was (Author: lirui):
1

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> -------------------------------------------------------------------
>
>                 Key: HIVE-17010
>                 URL: https://issues.apache.org/jira/browse/HIVE-17010
>             Project: Hive
>          Issue Type: Bug
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>         Attachments: HIVE-17010.1.patch, HIVE-17010.2.patch, HIVE-17010.3.patch
>
>
> We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of a specified RS (ReduceSink). We use the Long type, and it overflows when the data is too big. When that happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]. If spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond() is a dynamic value decided by the Spark runtime. For example, the value of sparkMemoryAndCores.getSecond() may be 5 or 15 at random, and it may even be 1. The main problem here is the overflow of Long addition. You can reproduce the overflow problem with the following code:
> {code}
> import java.math.BigInteger;
>
> public class OverflowDemo {
>   public static void main(String[] args) {
>     long a1 = 9223372036854775807L;  // Long.MAX_VALUE
>     long a2 = 1022672L;
>     long res = a1 + a2;              // wraps around
>     System.out.println(res);         // -9223372036853753137
>
>     BigInteger b1 = BigInteger.valueOf(a1);
>     BigInteger b2 = BigInteger.valueOf(a2);
>     BigInteger bigRes = b1.add(b2);  // no overflow
>     System.out.println(bigRes);      // 9223372036855798479
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
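For reference, here is a minimal sketch of the kind of fix the description points to: accumulating the sibling byte counts with BigInteger so the running sum cannot wrap, and clamping back to a long only when a long value is needed. This is an illustrative assumption, not the actual HIVE-17010 patch; the class and variable names (SaturatingByteSum, siblingByteCounts) are made up for the example.

{code}
import java.math.BigInteger;

// Illustrative sketch only -- not the HIVE-17010 patch. Accumulate the byte
// counts of sibling operators with BigInteger so the sum never overflows,
// then clamp to Long.MAX_VALUE when a long is required.
public class SaturatingByteSum {
  public static void main(String[] args) {
    long[] siblingByteCounts = {9223372036854775807L, 1022672L, 42L};

    BigInteger numberOfBytes = BigInteger.ZERO;
    for (long bytes : siblingByteCounts) {
      numberOfBytes = numberOfBytes.add(BigInteger.valueOf(bytes));
    }

    // Clamp instead of wrapping to a negative value.
    long clamped = numberOfBytes.min(BigInteger.valueOf(Long.MAX_VALUE)).longValueExact();
    System.out.println(numberOfBytes); // 9223372036855798521
    System.out.println(clamped);       // 9223372036854775807
  }
}
{code}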