[ https://issues.apache.org/jira/browse/HIVE-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826719#comment-13826719 ]
Remus Rusanu commented on HIVE-5845:
------------------------------------

Hello Ashutosh,

I've looked at this, and my opinion is that the problem is in ORC's VectorizedSerde.serialize path. Although we are writing an OrcStruct field, the VectorizedOrcSerde is constructed with the passed-in object inspector, which describes the input struct, instead of the OrcStructInspector that should be used with the created OrcStruct. I tried this patch:

diff --git ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
index d765353..c4268c1 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
@@ -143,9 +143,9 @@ public SerDeStats getSerDeStats() {
   public Writable serializeVector(VectorizedRowBatch vrg, ObjectInspector objInspector)
       throws SerDeException {
     if (vos == null) {
-      vos = new VectorizedOrcSerde(objInspector);
+      vos = new VectorizedOrcSerde(getObjectInspector());
     }
-    return vos.serialize(vrg, objInspector);
+    return vos.serialize(vrg, getObjectInspector());
   }

However, with this fix I'm hitting other (very familiar...) cast exceptions:

Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.TimestampWritable cannot be cast to java.sql.Timestamp
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaTimestampObjectInspector.getPrimitiveJavaObject(JavaTimestampObjectInspector.java:39)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TimestampTreeWriter.write(WriterImpl.java:1172)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)

Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$IntegerTreeWriter.write(WriterImpl.java:762)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)

Before I go and hack through code I'm only vaguely familiar with (the ORC serdes), do you have someone more experienced in this area at HW who could have a look too? It seems that the ORC writer expects Java primitive types where the vector file sink creates Writables instead... I'm afraid that if I 'fix' this one way, some other place will break.

Thanks,
~Remus

From: Ashutosh Chauhan (JIRA) [mailto:j...@apache.org]
Sent: Tuesday, November 19, 2013 1:11 AM
To: Remus Rusanu
Subject: [jira] [Commented] (HIVE-5845) CTAS failed on vectorized code path

Ashutosh Chauhan commented on an issue Re: CTAS failed on vectorized code path

Stack-trace:

Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to [Ljava.lang.Object;
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldData(StandardStructObjectInspector.java:173)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)

Following query fails:

create table store_sales_2 stored as orc as select * from alltypesorc;

This message was sent by Atlassian JIRA (v6.1#6144-sha1:2e50328)

> CTAS failed on vectorized code path
> -----------------------------------
>
>                 Key: HIVE-5845
>                 URL: https://issues.apache.org/jira/browse/HIVE-5845
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ashutosh Chauhan
>            Assignee: Remus Rusanu
>
> Following query fails:
> create table store_sales_2 stored as orc as select * from alltypesorc;
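For readers unfamiliar with the object-inspector model, the Writable-vs-Java-primitive mismatch behind the cast exceptions above can be reproduced in isolation. The sketch below is hypothetical and simplified: TimestampWritableLike, javaInspectorGet and writableInspectorGet are stand-ins invented for illustration, not Hive's real TimestampWritable/ObjectInspector API. A "java" inspector assumes it is handed a raw java.sql.Timestamp, while a vectorized sink hands over a wrapper, so the unchecked cast fails the same way the TimestampWritable trace does.

```java
import java.sql.Timestamp;

public class InspectorMismatch {

    // Hypothetical stand-in for Hive's TimestampWritable: a wrapper that
    // carries the value instead of exposing a raw java.sql.Timestamp.
    static final class TimestampWritableLike {
        private final Timestamp value;
        TimestampWritableLike(Timestamp value) { this.value = value; }
        Timestamp get() { return value; }
    }

    // Stand-in for a "java object" inspector (cf. JavaTimestampObjectInspector):
    // it assumes the row field is already a raw java.sql.Timestamp.
    static Timestamp javaInspectorGet(Object field) {
        return (Timestamp) field; // throws ClassCastException for the wrapper
    }

    // Stand-in for a "writable" inspector: it unwraps instead of casting.
    static Timestamp writableInspectorGet(Object field) {
        return ((TimestampWritableLike) field).get();
    }

    public static void main(String[] args) {
        // A vectorized sink produces wrapped (Writable-style) values.
        Object rowField = new TimestampWritableLike(
                Timestamp.valueOf("2013-11-19 01:11:00"));

        // Matching inspector: unwraps the value without trouble.
        Timestamp unwrapped = writableInspectorGet(rowField);
        System.out.println("writable inspector: " + (unwrapped != null ? "ok" : "null"));

        // Mismatched inspector: the unchecked cast blows up, mirroring
        // "TimestampWritable cannot be cast to java.sql.Timestamp" above.
        try {
            javaInspectorGet(rowField);
            System.out.println("java inspector: ok");
        } catch (ClassCastException e) {
            System.out.println("java inspector: ClassCastException");
        }
    }
}
```

This is why swapping in getObjectInspector() alone is not enough: the writer's tree of TreeWriters is built against one inspector family, and every leaf must agree on whether fields arrive as Writables or as raw Java objects.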