[ https://issues.apache.org/jira/browse/HIVE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao Sun updated HIVE-16698: ---------------------------- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Test failure unrelated. Committed to master. Thanks Xuefu for the review! > HoS should avoid mapjoin optimization in case of union and using table stats > ---------------------------------------------------------------------------- > > Key: HIVE-16698 > URL: https://issues.apache.org/jira/browse/HIVE-16698 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Spark > Affects Versions: 3.0.0 > Reporter: Chao Sun > Assignee: Chao Sun > Fix For: 3.0.0 > > Attachments: HIVE-16698.1.patch > > > When {{hive.spark.use.ts.stats.for.mapjoin}} is true, HoS would not check > whether the big table branch has upstream UNION operators. This is wrong and > could generate incorrect plan. To reproduce: > {code} > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=16; > set hive.spark.use.ts.stats.for.mapjoin=true; > create table a (c1 string, c2 int); > create table b (c3 string, c4 int); > create table c (c1 string, c2 int); > create table d (c3 string, c4 int); > create table e (c5 string, c6 int); > insert into table a values > ("a1", 1), ("a2", 2), ("a3", 3), ("a4", 4), ("a5", 5), ("a6", 6), ("a7", 7); > insert into table b values > ("b1", 1), ("b2", 2), ("b3", 3), ("b4", 4); > insert into table c values > ("c1", 1), ("c2", 2), ("c3", 3), ("c4", 4), ("c5", 5), ("c6", 6), ("c7", 7); > insert into table d values > ("d1", 1), ("d2", 2), ("d3", 3), ("d4", 4); > insert into table e values > ("d1", 1), ("d2", 2); > explain > with t1 as ( > select a.c1 as c1, a.c2 as c2, b.c3 as c3 from a join b on a.c2 = b.c4 > ), > t2 as ( > select c.c1 as c1, c.c2 as c2, d.c3 as c3 from c join d on c.c2 = d.c4 > ), > t3 as ( > select * from t1 union all select * from t2 > ), > t4 as ( > select t3.c1, t3.c3, t5.c5 from t3 join e as t5 on t3.c2 = t5.c6 > ) > select * from t4; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)