Sorry, I meant to paste this stack trace instead:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 385986740
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:180)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:663)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
]
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:167)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 385986740
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:180)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:663)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
]
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
    ... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 385986740
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:180)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:98)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:234)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:652)
    ... 9 more
On Wed, Sep 25, 2013 at 2:16 PM, Steven Wong <sw...@netflix.com> wrote:

> For me, the bug exhibits itself in Hive 0.11 as the following stack trace. I'm putting it here so that people searching on a similar problem can find this discussion thread in a web search. The discussion thread contains a workaround and a patch.
>
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 175
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:287)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:188)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
>     at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:343)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:343)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:213)
>     at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:251)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:423)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> ]
>     at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:423)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 175
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:287)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:188)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
>     at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:343)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:343)
>     at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:213)
>     at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:251)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:423)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> ]
>     at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
>     ... 7 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 175
>     at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:131)
>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>     at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
>     ... 7 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 175
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:287)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:188)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:138)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:195)
>     at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:102)
>     at org.apache.hadoop.hive.ql.exec.JoinUtil.computeValues(JoinUtil.java:243)
>     at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:82)
>     ... 9 more
>
> On Mon, Sep 16, 2013 at 5:20 AM, Sun, Rui <rui....@intel.com> wrote:
>
>> Hi, Amit,
>>
>> You can see the description of HIVE-5256 for a more detailed explanation.
>>
>> Both table aliases and table names (if there is no alias) may run into this issue.
>>
>> This issue happened to be masked by the XML serialization/deserialization of the MapredWork containing the join operator (HashMap serialization/deserialization reverses the order of key-value pairs that share a bucket), and it was exposed by HIVE-4078, because the copy of the MapredWork made for the noconditionaltask optimization was optimized away.
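A minimal sketch of the bucket-order reversal described above (an illustration, not Hive code): the classic colliding pair "Aa"/"BB" stands in for the aliases, and a java.beans XML round-trip stands in for the MapredWork plan serialization. On JDK 6 and 7, the JDKs Hive 0.11 typically ran on, the colliding keys come back in the opposite order; JDK 8 changed HashMap's internals, so the reversal no longer shows there.

import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.HashMap;

public class BucketOrderDemo {
    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112, so they share a HashMap bucket.
        HashMap<String, Integer> map = new HashMap<String, Integer>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println("before round-trip: " + map.keySet());

        // XML-encode and decode the map, the way Hive 0.11 round-tripped its plan.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        XMLEncoder enc = new XMLEncoder(bos);
        enc.writeObject(map);
        enc.close();
        XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(bos.toByteArray()));
        @SuppressWarnings("unchecked")
        HashMap<String, Integer> copy = (HashMap<String, Integer>) dec.readObject();
        dec.close();

        // The decoder replays put() calls; on JDK 6/7 each put() prepends to the
        // bucket chain, so the two colliding keys come back in reversed order.
        System.out.println("after round-trip:  " + copy.keySet());
    }
}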
>>
>> From: Amit Sharma [mailto:amsha...@netflix.com]
>> Sent: Friday, September 13, 2013 6:05 AM
>> To: user@hive.apache.org
>> Subject: Re: 回复: hive 0.11 auto convert join bug report
>>
>> Hi Navis,
>>
>> I was trying to look at this email thread as well as the jira to understand the scope of this issue. Does it get triggered only when aliases happen to map to the same value upon hashing, or can it be triggered under other conditions as well? What if no aliases are used and the table names happen to map to colliding hashcode values?
>>
>> Also, is changing the alias the only workaround for this problem, or is some other workaround possible?
>>
>> Thanks,
>> Amit
>>
>> On Sun, Aug 11, 2013 at 9:22 PM, Navis류승우 <navis....@nexr.com> wrote:
>>
>> Hi,
>>
>> Hive is notorious for producing different results with different aliases. Changing the alias was a last-resort way to avoid a bug in a desperate situation.
>>
>> I think the patch in the issue is ready; I hope it's helpful.
>>
>> Thanks.
>>
>> 2013/8/11 <wzc1...@gmail.com>:
>>
>> > Hi Navis,
>> >
>> > My colleague chenchun found that the hashcodes of 'deal' and 'dim_pay_date' collide, and that the code in MapJoinProcessor.java ignores the order of the row schema (a quick way to check the collision is sketched after this message). I looked at your patch, and it touches exactly the place we were working on. Thanks for your patch.
>> >
>> > On Sunday, August 11, 2013, at 9:38 PM, Navis류승우 wrote:
>> >
>> > Hi,
>> >
>> > I've booked this on https://issues.apache.org/jira/browse/HIVE-5056 and attached a patch for it.
>> >
>> > It needs a full test run for confirmation, but you can try it.
>> >
>> > Thanks.
>> >
>> > 2013/8/11 <wzc1...@gmail.com>:
>> >
>> > Hi all:
>> > When I change the table alias dim_pay_date to A, the query passes in Hive 0.11 (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):
>> >
>> > use test;
>> > create table if not exists src (`key` int, `val` string);
>> > load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
>> > drop table if exists orderpayment_small;
>> > create table orderpayment_small (`dealid` int, `date` string, `time` string, `cityid` int, `userid` int);
>> > insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55, 5372613 from src limit 1;
>> > drop table if exists user_small;
>> > create table user_small (userid int);
>> > insert overwrite table user_small select key from src limit 100;
>> > set hive.auto.convert.join.noconditionaltask.size = 200;
>> > SELECT `A`.`date`, `deal`.`dealid`
>> > FROM `orderpayment_small` `orderpayment`
>> > JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
>> > JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
>> > JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
>> > JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
>> > limit 5;
>> >
>> > It's quite strange and interesting. I will keep searching for the answer to this issue.
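A quick way to check the collision claim above from a scratch file (again an illustration, not Hive code): the raw String hashcodes of the two aliases differ, but after the supplemental hash that JDK 6/7 java.util.HashMap applied before bucket masking, both aliases land in the same bucket of a default-capacity (16) table. The relative order of entries within one shared bucket is exactly what the XML round-trip sketched earlier can flip. Which map instance and capacity were actually involved inside Hive is an assumption here.

public class AliasBucketCheck {
    // The supplemental hash from JDK 6/7 java.util.HashMap: spreads the
    // hashCode's high bits downward before bucket masking.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Bucket selection from the same HashMap: index = hash & (capacity - 1).
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // Prints different hashCodes but the same bucket index for capacity 16.
        for (String alias : new String[] { "deal", "dim_pay_date" }) {
            System.out.println(alias + ": hashCode=" + alias.hashCode()
                    + ", bucket=" + indexFor(hash(alias.hashCode()), 16));
        }
    }
}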
>> >
>> > On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote:
>> >
>> > Hi all:
>> > I'm currently testing Hive 0.11 and have run into a bug with hive.auto.convert.join. I've constructed a testcase so everyone can reproduce it (also available here: https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
>> >
>> > use test;
>> > create table src (`key` int, `val` string);
>> > load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
>> > drop table if exists orderpayment_small;
>> > create table orderpayment_small (`dealid` int, `date` string, `time` string, `cityid` int, `userid` int);
>> > insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55, 5372613 from src limit 1;
>> > drop table if exists user_small;
>> > create table user_small (userid int);
>> > insert overwrite table user_small select key from src limit 100;
>> > set hive.auto.convert.join.noconditionaltask.size = 200;
>> > SELECT `dim_pay_date`.`date`, `deal`.`dealid`
>> > FROM `orderpayment_small` `orderpayment`
>> > JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
>> > JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
>> > JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
>> > JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
>> > limit 5;
>> >
>> > You should replace the path to kv1.txt with your own. If you run the above query in Hive 0.11, it fails with an ArrayIndexOutOfBoundsException. You can see the explain result and the console output of the query here: https://gist.github.com/code6/6187569
>> >
>> > I compiled the trunk code, but it doesn't work with this query either. I can run this query in Hive 0.9 with hive.auto.convert.join turned on.
>> >
>> > I tried to dig into this problem, and I think it may be caused by the map join optimization: the input/output table info of some adjacent operators doesn't match (the column positions differ). A toy illustration of this failure mode is sketched after this message.
>> >
>> > I'm not able to fix this bug myself, and I would appreciate it if someone would look into it.
>> >
>> > Thanks.
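The "column positions differ" diagnosis above matches the shape of the ArrayIndexOutOfBoundsException in the stack traces: when two adjacent operators disagree about field order, a data byte gets interpreted as a length or type byte, and indexing runs off the end of the row buffer. A toy sketch of that failure mode, using a made-up two-field layout rather than Hive's actual LazyBinary format:

import java.nio.charset.StandardCharsets;

public class ColumnOrderMismatchDemo {
    // Writer emits the row as (date string, dealid int):
    // [1-byte length][UTF-8 bytes][4-byte big-endian int].
    static byte[] writeRow(String date, int dealid) {
        byte[] s = date.getBytes(StandardCharsets.UTF_8);
        byte[] row = new byte[1 + s.length + 4];
        row[0] = (byte) s.length;
        System.arraycopy(s, 0, row, 1, s.length);
        int off = 1 + s.length;
        row[off] = (byte) (dealid >>> 24);
        row[off + 1] = (byte) (dealid >>> 16);
        row[off + 2] = (byte) (dealid >>> 8);
        row[off + 3] = (byte) dealid;
        return row;
    }

    // Reader wrongly assumes the column order is (dealid int, date string).
    static String misreadRow(byte[] d) {
        int dealid = ((d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16)
                   | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF); // eats the length byte + 3 chars
        int len = d[4] & 0xFF;                  // actually a date character ('1' = 49)
        StringBuilder date = new StringBuilder();
        for (int i = 0; i < len; i++) {
            date.append((char) d[5 + i]);       // runs past d.length
        }
        return dealid + " / " + date;
    }

    public static void main(String[] args) {
        byte[] row = writeRow("2011-03-24", 748);  // values from the testcase
        System.out.println(misreadRow(row));       // throws ArrayIndexOutOfBoundsException
    }
}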