Hi,

We're currently experiencing an issue with queries against an ORC-backed table. Nothing special about the queries - any query against the table causes it.
We're currently using HDP 2.2.4.x, so Hive 0.14.0.2.2.4.x. The error we're seeing in the logs is:

Caused by: java.lang.RuntimeException: Error creating a batch
    at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:111)
    at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:49)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.createValue(HiveRecordReader.java:58)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.createValue(HiveRecordReader.java:33)
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.createValue(TezGroupedSplitsInputFormat.java:141)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:150)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
    at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:609)
    at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:588)
    at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:140)
    at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:361)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:134)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
    ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: No type found for column type entry 584
    at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addScratchColumnsToBatch(VectorizedRowBatchCtx.java:604)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.createVectorizedRowBatch(VectorizedRowBatchCtx.java:339)
    at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:109)
    ... 26 more

Not sure if it has any relevance, but when running orcfiledump on the various ORC files involved in the query, some are DICTIONARY_V2 encoded and some are DIRECT_V2, depending on the data in column 584.

We can work around it by disabling hive.vectorized.execution.enabled. Has anyone else experienced anything similar?

Thanks
-Dave
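For reference, this is how we apply the workaround at the session level (the same property could also be set in hive-site.xml; the session-level SET is just what we used for testing):

```sql
-- Disable vectorized query execution for the current session only;
-- with this set, queries against the ORC table complete without the error.
SET hive.vectorized.execution.enabled=false;
```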