[ https://issues.apache.org/jira/browse/HIVE-21746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841809#comment-16841809 ]
Jason Dere commented on HIVE-21746:
-----------------------------------

I believe the dynamically partitioned hash join has issues when the join keys are constant-folded. Looking at the ReduceSink output that feeds into the dynamically partitioned hash join:
{noformat}
Reduce Output Operator
  key expressions: _col20 (type: string), 'HR3' (type: string)
  null sort order: aa
  sort order: ++
  Map-reduce partition columns: _col20 (type: string), 'HR3' (type: string)
  Statistics: Num rows: 3800000 Data size: 1288485344 Basic stats: COMPLETE Column stats: PARTIAL
  tag: 0
  value expressions: _col2 (type: timestamp), _col3 (type: timestamp), _col51 (type: timestamp), _col124 (type: timestamp)
{noformat}
So the value expressions in the ReduceSink consist of 4 timestamp columns, and the data written out and sent to the join appears to match that. However, the input schema to the MapJoin operator shows 5 columns rather than 4:
{noformat}
*** valCols[0] for JOIN JOIN_13: [Column[VALUE._col2], Column[VALUE._col3], Column[KEY.reducesinkkey1], Column[VALUE._col49], Column[VALUE._col122]]
{noformat}
With types (timestamp, timestamp, string, timestamp, timestamp).

Note that the third column in this list is KEY.reducesinkkey1. Key columns should have been filtered out of the value columns in MapJoinProcessor.getMapJoinDesc(), in the section that populates valueTableDescs. But the keyExprMap generated by ExprNodeDescUtils.resolveJoinKeysAsRSColumns(), which is only done for dynamically partitioned hash join, does not properly match the KEY.reducesinkkey1 column from the ReduceSinkOperator when the key columns are filtered from the value columns.
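To make the filtering failure concrete, here is a minimal, self-contained sketch. It does not use Hive's classes: ColumnRef, filterKeys() and joinInput() are hypothetical stand-ins, the "t2" alias on the value columns is illustrative, and the sketch assumes key columns are removed by an equality check that includes the table alias.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class KeyFilterSketch {

    // Hypothetical stand-in for a column descriptor; NOT Hive's ExprNodeColumnDesc.
    static final class ColumnRef {
        final String column;
        final String tabAlias;

        ColumnRef(String column, String tabAlias) {
            this.column = column;
            this.tabAlias = tabAlias;
        }

        // Assumption: two references match only if column name AND table alias agree.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ColumnRef)) {
                return false;
            }
            ColumnRef other = (ColumnRef) o;
            return column.equals(other.column) && tabAlias.equals(other.tabAlias);
        }

        @Override
        public int hashCode() {
            return Objects.hash(column, tabAlias);
        }
    }

    // Keep only the columns that are not recognised as join keys.
    static List<ColumnRef> filterKeys(List<ColumnRef> columns, List<ColumnRef> keys) {
        List<ColumnRef> values = new ArrayList<>();
        for (ColumnRef c : columns) {
            if (!keys.contains(c)) {
                values.add(c);
            }
        }
        return values;
    }

    // Five columns as seen by the join; the third one is really a key column.
    static List<ColumnRef> joinInput() {
        return Arrays.asList(
            new ColumnRef("VALUE._col2", "t2"),
            new ColumnRef("VALUE._col3", "t2"),
            new ColumnRef("KEY.reducesinkkey1", "t2"),
            new ColumnRef("VALUE._col49", "t2"),
            new ColumnRef("VALUE._col122", "t2"));
    }

    public static void main(String[] args) {
        // Key reference built with an empty alias: nothing matches, the key stays in.
        List<ColumnRef> emptyAliasKey = Arrays.asList(new ColumnRef("KEY.reducesinkkey1", ""));
        System.out.println(filterKeys(joinInput(), emptyAliasKey).size());    // prints 5

        // Key reference carrying the same alias: the key is filtered out.
        List<ColumnRef> matchingAliasKey = Arrays.asList(new ColumnRef("KEY.reducesinkkey1", "t2"));
        System.out.println(filterKeys(joinInput(), matchingAliasKey).size()); // prints 4
    }
}
```

With the empty alias the key column survives the filter, which gives a 5-column value schema for data that only carries 4 values.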
The column reference generated from the constant folded column, in keyExprMap:
{noformat}
1 = {ExprNodeColumnDesc@9714} "Column[KEY.reducesinkkey1]"
 column = "KEY.reducesinkkey1"
 tabAlias = ""
 isPartitionColOrVirtualCol = false
 isSkewedCol = false
 typeInfo = {PrimitiveTypeInfo@9719} "string"
{noformat}
What should have been the corresponding key in the ReduceSinkOperator:
{noformat}
expr = {ExprNodeColumnDesc@8704} "Column[KEY.reducesinkkey1]"
 column = "KEY.reducesinkkey1"
 tabAlias = "t2"
 isPartitionColOrVirtualCol = true
 isSkewedCol = false
 typeInfo = {PrimitiveTypeInfo@9719} "string"
{noformat}
The difference is the ReduceSinkOperator key has tabAlias = "t2". The one generated by ExprNodeDescUtils.resolveJoinKeysAsRSColumns() currently has a tabAlias hardcoded to "".

One solution is for ExprNodeConstantDesc to keep a foldedFromTab for the table alias, in addition to foldedFromCol which it already has. That way ExprNodeDescUtils.resolveJoinKeysAsRSColumns() can generate a column reference with the same matching tableAlias as its parent ReduceSinkOperator.

> ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21746
>                 URL: https://issues.apache.org/jira/browse/HIVE-21746
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>
> ArrayIndexOutOfBounds exception during query execution with dynamically partitioned hash join.
> Found on Hive 2.x. Seems to occur with CBO disabled/failed.
> Disabling constant propagation seems to allow the query to succeed.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 203
>     at org.apache.hadoop.hive.serde2.io.TimestampWritable.getTotalLength(TimestampWritable.java:217) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:205) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getFieldsAsList(LazyBinaryStruct.java:281) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.unpack(MapJoinBytesTableContainer.java:744) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:730) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:605) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:70) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:34) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:819) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:924) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:456) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:359) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:290) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:319) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:189) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:377) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) ~[hadoop-common-2.7.3.2.6.4.119-3.jar:?]
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3]
>     at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) ~[hive-llap-server-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
>     at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)