Re: hive 0.11 auto convert join bug report

wzc1989 Sun, 11 Aug 2013 00:51:54 -0700

Hi all,
When I change the table alias dim_pay_date to A, the query passes in Hive 
0.11 (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):


use test;
create table if not exists src ( `key` int,`val` string);
load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite 
into table src;
drop table if exists orderpayment_small;
create table orderpayment_small (`dealid` int,`date` string,`time` string, 
`cityid` int, `userid` int);
insert overwrite table orderpayment_small select 748, '2011-03-24', 
'2011-03-24', 55 ,5372613 from src limit 1;
drop table if exists user_small;
create table user_small( userid int);
insert overwrite table user_small select key from src limit 100;
set hive.auto.convert.join.noconditionaltask.size = 200;
SELECT
`A`.`date`
, `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = 
`orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;



This is strange and interesting: the only change to the query is the table 
alias. I will keep looking for the cause of this issue.
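
To narrow this down, one option is to compare the plans of the two variants with EXPLAIN and then rerun the failing variant with the map join conversion switched off. The following is only a diagnostic sketch, assuming the test database and tables created above already exist. hive.auto.convert.join is a standard Hive setting, but whether disabling it avoids the exception here is an assumption I have not verified.

```sql
-- Diagnostic sketch; assumes the tables from the test case above exist.
use test;

-- Step 1: print the plan of the failing variant (alias dim_pay_date)
-- with the automatic map join conversion enabled.
set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask.size=200;
EXPLAIN
SELECT
`dim_pay_date`.`date`
, `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;

-- Step 2: rerun the same query with the conversion disabled, so the joins
-- are planned as ordinary reduce-side joins (untested assumption: the
-- exception does not occur on this path).
set hive.auto.convert.join=false;
SELECT
`dim_pay_date`.`date`
, `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;
```

Running the same EXPLAIN with the alias A in place of dim_pay_date and diffing the two outputs should show where the operator trees diverge.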




On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote:

> Hi all,
> I'm currently testing Hive 0.11 and have run into a bug with 
> hive.auto.convert.join. I have constructed a test case so that everyone can 
> reproduce it (it is also available 
> here: https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
>  
> use test;
> create table src ( `key` int,`val` string);
> load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite 
> into table src;
> drop table if exists orderpayment_small;
> create table orderpayment_small (`dealid` int,`date` string,`time` string, 
> `cityid` int, `userid` int);
> insert overwrite table orderpayment_small select 748, '2011-03-24', 
> '2011-03-24', 55 ,5372613 from src limit 1;
> drop table if exists user_small;
> create table user_small( userid int);
> insert overwrite table user_small select key from src limit 100;
> set hive.auto.convert.join.noconditionaltask.size = 200;
> SELECT
> `dim_pay_date`.`date`
> , `deal`.`dealid`
> FROM `orderpayment_small` `orderpayment`
> JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = 
> `orderpayment`.`date`
> JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = 
> `orderpayment`.`cityid`
> JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> limit 5;
>  
>  
> Replace the path to kv1.txt with your own. Running the above query in 
> Hive 0.11 fails with an ArrayIndexOutOfBoundsException. The EXPLAIN output 
> and the console output of the query are available here: 
> https://gist.github.com/code6/6187569
>  
> I also compiled the trunk code, and the query fails there as well. The 
> query does run in Hive 0.9 with hive.auto.convert.join turned on.
>  
> I tried to dig into this problem, and I think it may be caused by the map 
> join optimization: some adjacent operators do not agree on their 
> input/output table info (the column positions differ).
>  
> I have not been able to fix this bug, and I would appreciate it if someone 
> could look into it.
>  
> Thanks.
