Thanks!
On Mon, Dec 27, 2010 at 11:56 PM, Liyin Tang wrote:
> Yes. Only one of them is executed.
>
> On 27 December 2010 23:43, wrote:
>
> > A question about the design doc:
> >
> > "If one of the tables is large and others are small enough to run Map Join,
> > then the Conditional Task will
Yes. Only one of them is executed.
On 27 December 2010 23:43, wrote:
> A question about the design doc:
>
> "If one of the tables is large and others are small enough to run Map Join,
> then the Conditional Task will pick the corresponding Map Join Local Task to
> run."
> Here you pick one tab
A question about the design doc:
"If one of the tables is large and others are small enough to run Map Join,
then the Conditional Task will pick the corresponding Map Join Local Task to
run."
Here you pick one table as big, hash all other tables into memory by join
key individually. If it works, i
Hi,
If multiple tables are joined on different join keys, the join will be split
into multiple MapRed tasks.
Also, the small-table file size threshold applies to the sum of the sizes of
all the small tables.
There is documentation and a slide deck about this feature:
http://www.slideshare.net/aiolos127/join-optimization
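To make the size rule above concrete, here is a minimal sketch in Python of how the choice between a map join and a common join could be modelled. It is not Hive's code: the function name, the constant, the exact byte value used for "25M", and the inclusive comparison at the boundary are all assumptions made for illustration. It only encodes what is stated in this thread, namely that the threshold is compared against the sum of all the small tables.

```python
# Hypothetical model of the rule described in this thread; not Hive source code.
# "25M" is taken from the thread; the exact byte value is an assumption.
SMALL_TABLE_THRESHOLD_BYTES = 25 * 1000 * 1000


def choose_join_plan(table_sizes, threshold=SMALL_TABLE_THRESHOLD_BYTES):
    """Return ("map_join", big_table) or ("common_join", None).

    table_sizes maps a table name to its file size in bytes.
    """
    # Treat the largest table as the "big" table; this leaves the smallest
    # possible combined size for the remaining tables.
    big_table = max(table_sizes, key=table_sizes.get)
    small_total = sum(table_sizes.values()) - table_sizes[big_table]

    # The threshold applies to the SUM of the small tables, not to each one.
    # Whether the boundary is inclusive is an assumption of this sketch.
    if small_total <= threshold:
        return ("map_join", big_table)
    return ("common_join", None)
```

For example, two 15M tables joined with one large table would fall back to the common join in this model, because together they exceed 25M even though each one is under the threshold.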
Thanks for the reply. I want to get clarification on this feature.
If one of the two joined tables, t1, is smaller than 25M and is
sharded, how does this feature work?
Suppose there are joins on multiple tables such as t1, t2 and t3. If t1 and
t2 are smaller than 25M and co-located with joi
Hi,
How large are t1 and t2?
If both t1 and t2 are larger than 25M (the default threshold), the query
processor will do a common join.
Thanks
Liyin
On 23 December 2010 18:50, wrote:
> Hi,
>
> I set hive.auto.convert.join=true and run the following query:
>
> select t1.foo, count(t2.bar) from