Hi. I'm using Hive 3.1.0 (Tez Execution Engine) and I'm running into intermittent errors when doing Hive Merge.
Just to clarify, the Hive Merge query probably succeeds 60% of the time using the same source and destination table for the Hive Merge query. By the way, both the source and destination table has columns with complex data types such as ARRAY<STRING> and MAP<STRING, STRING>. Here's the error : TaskAttempt 0 failed, info= ยป Error: Error while running task ( failure ) : attempt_1594345704665_28139_1_06_000007_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:396) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:493) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:387) ... 19 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024 at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:187) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:588) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:778) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeMapRowColumn(VectorDeserializeRow.java:855) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:941) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470) ... 20 more Would someone know a workaround for this? Thanks, Bernard