[ https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445006#comment-15445006 ]
zhihai xu commented on HIVE-14564:
----------------------------------

Thanks for the review, [~ashutoshc]! A lot of test cases were updated to adapt to this patch, and it looks like all of these cases can verify the patch.

> Column Pruning generates out-of-order columns in SelectOperator, which causes ArrayIndexOutOfBoundsException.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14564
>                 URL: https://issues.apache.org/jira/browse/HIVE-14564
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: HIVE-14564.000.patch, HIVE-14564.001.patch
>
>
> Column Pruning generates out-of-order columns in SelectOperator, which causes ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> 	... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> 	at java.lang.System.arraycopy(Native Method)
> 	at org.apache.hadoop.io.Text.set(Text.java:225)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
> 	at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
> 	... 13 more
> {code}
> The exception occurs because serialization and deserialization do not match: the previous MapReduce job's LazyBinarySerDe serialized the columns in a different order. When the current MapReduce job deserializes the intermediate sequence file produced by the previous job, LazyBinaryStruct reads the fields in the wrong column order and returns corrupted data. The column-order mismatch between serialization and deserialization is caused by the SelectOperator column pruning in {{ColumnPrunerSelectProc}}.
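
To make the column-order mismatch concrete, here is a minimal standalone Java sketch of the failure mode. It does not use Hive's LazyBinarySerDe/LazyBinaryStruct; the class and method names (ColumnOrderMismatchDemo, serialize, deserializeWrongOrder) and the two-column schema are made up for illustration. The writer emits (int, string) while the reader assumes (string, int), so an integer value gets misinterpreted as a string length and the read runs past the end of the row, analogous to the System.arraycopy failure in Text.set shown in the stack trace above.

{code}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ColumnOrderMismatchDemo {

    // Writer side, schema (int id, string name): the column order the
    // previous job used when serializing the intermediate rows.
    static byte[] serialize(int id, String name) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(id);                                    // column 0: 4-byte int
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        out.writeInt(utf8.length);                           // column 1: length-prefixed string
        out.write(utf8);
        out.flush();
        return bos.toByteArray();
    }

    // Reader side assuming the wrong order, schema (string name, int id):
    // the first 4 bytes (the int column's value) are misread as the string
    // length, so the copy runs past the end of the serialized row.
    static String deserializeWrongOrder(byte[] row) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int bogusLength = buf.getInt();   // actually the int column's value (42)
        byte[] strBytes = new byte[bogusLength];
        buf.get(strBytes);                // throws: only 8 bytes remain, not 42
        return new String(strBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] row = serialize(42, "hive");
        // Fails with a BufferUnderflowException in this toy reader; in Hive the
        // analogous misread inside LazyBinaryStruct surfaces as the
        // ArrayIndexOutOfBoundsException from Text.set in the stack trace above.
        System.out.println(deserializeWrongOrder(row));
    }
}
{code}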

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)