Hi all,

When I change the table alias dim_pay_date to A, the query passes in hive 0.11 (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):

use test;
create table if not exists src (`key` int, `val` string);
load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
drop table if exists orderpayment_small;
create table orderpayment_small (`dealid` int, `date` string, `time` string, `cityid` int, `userid` int);
insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55, 5372613 from src limit 1;
drop table if exists user_small;
create table user_small (userid int);
insert overwrite table user_small select key from src limit 100;
set hive.auto.convert.join.noconditionaltask.size = 200;
SELECT
  `A`.`date`
  , `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;

It's quite strange and interesting now. I will keep searching for the answer to this issue.
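
One way to narrow this down, assuming the failure really does come from the automatic map-join conversion (a guess at this point, not a confirmed diagnosis), is to switch hive.auto.convert.join off for the session and rerun the failing query from the original report. This is only a diagnostic sketch, not a fix:

```sql
-- Diagnostic sketch: disable automatic map-join conversion for this session only.
-- Assumption: the ArrayIndexOutOfBoundsException is tied to this optimization.
set hive.auto.convert.join = false;

-- Then rerun the failing SELECT (the one using the dim_pay_date alias) unchanged.
```

If the query passes with the setting off and fails with it on, that points at the map-join conversion rather than at the data or the table definitions.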

On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote:

> Hi all:
> I'm currently testing hive11 and encountered a bug with
> hive.auto.convert.join. I constructed a testcase so everyone can reproduce
> it (or you can reach the testcase here:
> https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
>
> use test;
> create table src ( `key` int,`val` string);
> load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite
> into table src;
> drop table if exists orderpayment_small;
> create table orderpayment_small (`dealid` int,`date` string,`time` string,
> `cityid` int, `userid` int);
> insert overwrite table orderpayment_small select 748, '2011-03-24',
> '2011-03-24', 55 ,5372613 from src limit 1;
> drop table if exists user_small;
> create table user_small( userid int);
> insert overwrite table user_small select key from src limit 100;
> set hive.auto.convert.join.noconditionaltask.size = 200;
> SELECT
> `dim_pay_date`.`date`
> , `deal`.`dealid`
> FROM `orderpayment_small` `orderpayment`
> JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` =
> `orderpayment`.`date`
> JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` =
> `orderpayment`.`cityid`
> JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> limit 5;
>
> You will need to replace the path to kv1.txt with your own. If you run the
> above query in hive 0.11, it fails with ArrayIndexOutOfBoundsException. You
> can see the explain result and the console output of the query here:
> https://gist.github.com/code6/6187569
>
> I compiled the trunk code, but the query doesn't work there either. I can
> run this query in hive 0.9 with hive.auto.convert.join turned on.
>
> I tried to dig into this problem and I think it may be caused by the map
> join optimization. Some adjacent operators don't match in their input/output
> tableinfo (the column positions differ).
>
> I'm not able to fix this bug and I would appreciate it if someone would like
> to look into this problem.
>
> Thanks.