Gopal V created HIVE-10047:
------------------------------

             Summary: LLAP: VectorMapJoinOperator gets an over-flow on batchSize of 1024
                 Key: HIVE-10047
                 URL: https://issues.apache.org/jira/browse/HIVE-10047
             Project: Hive
          Issue Type: Sub-task
    Affects Versions: llap
            Reporter: Gopal V
             Fix For: llap

Simple LLAP queries on constrained resources run into the following exception, which suggests an overflow of the 1024-row output batch:

{code}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
	at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$VectorLongColumnAssign.assignLong(VectorColumnAssignFactory.java:113)
	at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$9.assignObjectValue(VectorColumnAssignFactory.java:293)
	at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:196)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:653)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:752)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:316)
	... 22 more
{code}

The failure appears to come from the check for a full output batch sitting outside the loop in the snippet that follows. It looks like it can be triggered during MxN joins, where there are more joined values than there were rows in the input batch.

{code}
for (int i=0; i<values.length; ++i) {
  vcas[i].assignObjectValue(values[i], outputBatch.size);
}
++outputBatch.size;
if (outputBatch.size == VectorizedRowBatch.DEFAULT_SIZE) {
  flushOutput();
}
{code}
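
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not Hive code: {{BatchOverflowSketch}}, {{forwardAllThenCheck}} and {{forwardWithPerRowCheck}} are hypothetical names, and the single {{long[]}} column only stands in for an output batch whose capacity matches {{VectorizedRowBatch.DEFAULT_SIZE}} (1024). It assumes the problem is a fullness check that runs less often than rows are appended, which is what the stack trace hints at but has not been confirmed against the operator.

```java
/** Hypothetical stand-in for an output batch; this is NOT Hive's VectorizedRowBatch. */
class BatchOverflowSketch {
    // Assumed to mirror VectorizedRowBatch.DEFAULT_SIZE from the report.
    static final int BATCH_CAPACITY = 1024;

    final long[] column = new long[BATCH_CAPACITY];
    int size = 0;     // rows currently buffered
    int flushes = 0;  // how many times the batch was handed downstream

    /** Pretend to forward the batch downstream and start a new one. */
    void flush() {
        flushes++;
        size = 0;
    }

    /** Fullness is checked once, after all rows were appended (the suspected bug shape). */
    void forwardAllThenCheck(long[] joinedRows) {
        for (long row : joinedRows) {
            column[size] = row; // fails once size reaches BATCH_CAPACITY
            ++size;
        }
        if (size == BATCH_CAPACITY) {
            flush();
        }
    }

    /** Fullness is checked after every appended row, so size never exceeds capacity. */
    void forwardWithPerRowCheck(long[] joinedRows) {
        for (long row : joinedRows) {
            column[size] = row;
            ++size;
            if (size == BATCH_CAPACITY) {
                flush();
            }
        }
    }

    public static void main(String[] args) {
        // An MxN join can emit more rows than one batch holds; 1500 is an arbitrary example.
        long[] joinedRows = new long[1500];

        BatchOverflowSketch guarded = new BatchOverflowSketch();
        guarded.forwardWithPerRowCheck(joinedRows);
        System.out.println(guarded.flushes + " flush(es), " + guarded.size + " rows pending");

        try {
            new BatchOverflowSketch().forwardAllThenCheck(joinedRows);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("overflow at index " + BATCH_CAPACITY);
        }
    }
}
```

With 1500 joined rows, the per-row variant flushes once and leaves 476 rows pending, while the check-after-the-loop variant writes to index 1024 and throws. Note also that an {{==}} comparison never fires again once {{size}} has stepped past the capacity, so a {{>=}} test may be the more defensive choice.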