[ https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388973#comment-14388973 ]
Na Yang commented on HIVE-10083:
--------------------------------

Thank you [~csun] for the code review. I ran the q test for smb_mapjoin_8.q on my local machine and it was successful.

> SMBJoin fails in case one table is uninitialized
> ------------------------------------------------
>
>                 Key: HIVE-10083
>                 URL: https://issues.apache.org/jira/browse/HIVE-10083
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 0.13.0
>         Environment: MapR Hive 0.13
>            Reporter: Alain Schröder
>            Assignee: Na Yang
>            Priority: Minor
>         Attachments: HIVE-10083.patch
>
>
> We experience an IndexOutOfBoundsException in an SMBJoin when one of the
> tables used for the JOIN is uninitialized. Everything works if both are
> uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>       at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>       at java.util.ArrayList.get(ArrayList.java:411)
>       at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
>       at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
>       at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
>       at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
>       at org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> [...]
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the procedure
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in
> AbstractBucketJoinProc.java, and it does not seem to have changed between our
> MapR Hive 0.13 and the current snapshot, so this should also be an error in
> the current version.
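
The pattern in the report (the query works when both tables are empty or both are populated, and fails when only one is) is what an unguarded per-bucket index lookup would produce. The following is a minimal Java sketch under that assumption. The class and method names are hypothetical, the file names are made up, and this is neither the code in AbstractBucketJoinProc.java nor necessarily what HIVE-10083.patch changes; it only illustrates how indexing into an empty bucket file list raises the exception and one possible way to guard against it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Hive's implementation.
class BucketMappingSketch {

    // Pairs bucket i of the big table with bucket i of the small table.
    // Assumes both tables declare the same bucket count, but does NOT check
    // that the small table actually has bucket files on disk.
    static Map<String, List<String>> mapUnguarded(List<String> bigFiles,
                                                  List<String> smallFiles) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        for (int i = 0; i < bigFiles.size(); i++) {
            // Throws IndexOutOfBoundsException when smallFiles is empty.
            mapping.put(bigFiles.get(i),
                        Collections.singletonList(smallFiles.get(i)));
        }
        return mapping;
    }

    // Same pairing, but a missing small-table bucket maps to an empty list.
    static Map<String, List<String>> mapGuarded(List<String> bigFiles,
                                                List<String> smallFiles) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        for (int i = 0; i < bigFiles.size(); i++) {
            List<String> matches = i < smallFiles.size()
                    ? Collections.singletonList(smallFiles.get(i))
                    : Collections.<String>emptyList();
            mapping.put(bigFiles.get(i), matches);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Made-up bucket file names for a populated table and an empty one.
        List<String> populated = Arrays.asList("test1/000000_0", "test1/000001_0");
        List<String> uninitialized = new ArrayList<>();

        // Both empty: the loop body never runs, so nothing fails.
        System.out.println(mapUnguarded(uninitialized, uninitialized));

        // One populated, one empty: the unguarded lookup fails.
        try {
            mapUnguarded(populated, uninitialized);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded lookup failed: " + e.getMessage());
        }

        // The guarded variant returns an empty match list per bucket instead.
        System.out.println(mapGuarded(populated, uninitialized));
    }
}
```

In this sketch, an empty big-table list never enters the loop, which matches the observation that the query works when both tables are uninitialized. Whether an empty match list is the right result for the optimizer, as opposed to skipping the SMB join conversion altogether, is a design decision this sketch does not settle.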