Yes. Only execute one of the them. On 27 December 2010 23:43, <ash...@gmail.com> wrote:
> A question about the design doc: > > "If one of the tables is large and others are small enough to run Map Join, > then the Conditional Task will pick the corresponding Map Join Local Task > to > run." > Here you pick one table as big, hash all other tables into memory by join > key individually. If it works, it will be chosen. Can it be that if one of > the rest candidate tasks (chosing a different table as big table) performs > better (assuming), it will not be considered? Did I understand it right? > Thanks! > Alicia > > On Mon, Dec 27, 2010 at 12:05 AM, Liyin Tang <liyint...@gmail.com> wrote: > > > Hi, > > If multiple tables join on different join keys, it will be separated into > > multiple MapRed Tasks. > > Also the threshold of the small table file size means the sum of all the > > small table. > > > > There is a documentation and a slide about this feature: > > http://www.slideshare.net/aiolos127/join-optimization-in-hive > > <http://www.slideshare.net/aiolos127/join-optimization-in-hive> > > http://wiki.apache.org/hadoop/Hive/JoinOptimization > > > > <http://wiki.apache.org/hadoop/Hive/JoinOptimization>Thanks > > > > On 26 December 2010 23:42, <ash...@gmail.com> wrote: > > > > > Thanks for the reply. I want to get clarification on this feature. > > > > > > If one of the two joining tables table t1 is smaller than 25M and is > > > sharded, how does this feature work? > > > > > > > Suppose there are joins on multiple tables such as t1, t2 and t3. If t1 > > and > > > t2 are smaller than 25M and co-located with joining key, you can > stream > > t3 > > > and map join them. What if t1 and t2 are not co-located with join keys? > > > What > > > if t1 and t2 together are bigger than 25M? > > > > > > Thanks in advance! > > > On Thu, Dec 23, 2010 at 7:16 PM, Liyin Tang <liyint...@gmail.com> > wrote: > > > > > > > Hi, > > > > How large is t1 and t2 ? > > > > if both of t1 and t2 is larger than 25M (a default threshold), the > > query > > > > processor will do the common join. > > > > > > > > Thanks > > > > Liyin > > > > > > > > On 23 December 2010 18:50, <ash...@gmail.com> wrote: > > > > > > > > > Hi, > > > > > > > > > > I set hive.auto.convert.join=true and run the following query: > > > > > > > > > > select t1.foo, count(t2.bar) from invites t1 join invites t2 on > > > > > (t1.foo=t2.foo) group by t1.foo; > > > > > > > > > > I did not see it ran as map side join. Did I miss something? Is > there > > > any > > > > > precondition for this feature to work? > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > > -- > > > > Best Regards > > > > -Liyin > > > > > > > > > > > > > > > -- > > Best Regards > > -Liyin > > > -- Best Regards -Liyin