[ https://issues.apache.org/jira/browse/HIVE-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427069#comment-13427069 ]
Lianhui Wang commented on HIVE-3306:
------------------------------------

Also, I think there is another case to consider. For example, with
ON (a.key = b.key AND a.key = 10), the join should scan only the bucket file
that key 10 maps to, not every bucket file in the table's path.

> SMBJoin/BucketMapJoin should be allowed only when join key expression is exactly matches with sort/cluster key
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3306
>                 URL: https://issues.apache.org/jira/browse/HIVE-3306
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>
> CREATE TABLE bucket_small (key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_small;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_small;
> CREATE TABLE bucket_big (key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_big;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_big;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket3outof4.txt' INTO TABLE bucket_big;
> load data local inpath '/home/navis/apache/oss-hive/data/files/srcsortbucket4outof4.txt' INTO TABLE bucket_big;
>
> select count(*) FROM bucket_small a JOIN bucket_big b ON a.key + a.key = b.key;
> select /* + MAPJOIN(a) */ count(*) FROM bucket_small a JOIN bucket_big b ON a.key + a.key = b.key;
> returns 116 (same)
>
> But with BucketMapJoin or SMBJoin, it returns 61. This should not be allowed, because hash(a.key) != hash(a.key + a.key).
> Bucket context should be used only when the join expression exactly matches the sort/cluster key.
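
To make the description's point that hash(a.key) != hash(a.key + a.key) concrete, here is a minimal Python sketch, not Hive's actual implementation. It assumes that an int key hashes to its own value, that a row's bucket is the non-negative hash modulo the bucket count, and that a bucketed join pairs big-table bucket i with small-table bucket i % 2. The function names and the sample keys are hypothetical and are not taken from the test files above.

```python
def bucket_of(key: int, num_buckets: int) -> int:
    # Assumption: int keys hash to themselves; bucket = non-negative hash % bucket count.
    return (key & 0x7FFFFFFF) % num_buckets


def plain_join_count(small, big):
    # Reference result: compare every pair of rows, ignoring buckets.
    return sum(1 for a in small for b in big if a + a == b)


def bucketed_join_count(small, big, small_buckets=2, big_buckets=4):
    # Group the small table's rows by the bucket of a.key (the cluster key).
    small_by_bucket = {}
    for a in small:
        small_by_bucket.setdefault(bucket_of(a, small_buckets), []).append(a)

    count = 0
    for b in big:
        # Assumption: big bucket i is only joined against small bucket i % small_buckets.
        paired = bucket_of(b, big_buckets) % small_buckets
        count += sum(1 for a in small_by_bucket.get(paired, []) if a + a == b)
    return count


# Hypothetical sample keys, chosen so that every small row has one match.
small_keys = [1, 2, 3, 4]
big_keys = [2, 4, 6, 8]

print(plain_join_count(small_keys, big_keys))     # 4
print(bucketed_join_count(small_keys, big_keys))  # 2
print(bucket_of(10, 4))                           # 2
```

In this sketch the rows with odd a.key sit in small bucket 1, while their matches (b.key = 2 * a.key, always even) sit in big buckets that are paired with small bucket 0, so those matches are silently dropped. Under the same assumptions, the last line suggests which single bucket a filter such as a.key = 10 would need to read in a 4-bucket table.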