Re: skew join optimization

2011-03-20 Thread Igor Tatarinov
Thanks everyone! I had a typo when setting auto convert to true. You can actually see it in my first email ('set' was repeated twice but there was no syntax error). With map joins enabled, my join finished in 30 minutes. Sweet! Looks like 'true' should be the default option for auto.convert Anot

Re: skew join optimization

2011-03-20 Thread yongqiang he
Skew join does not work together with map join. Map join does not require any reducer. Please double-check that the Hive you use has the auto map join feature. If auto convert join is in your Hive, only set hive.auto.convert.join = true; should do the work. thanks yongqiang On Sun, Mar 2
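
For reference, a minimal sketch of the setting discussed in this thread, assuming a Hive build that includes the auto map join feature. Printing the property back is only a suggested sanity check against a mistyped `set` line of the kind mentioned above; it is not something the posters describe doing.

```sql
-- Enable automatic conversion of eligible joins into map joins
-- (only has an effect if this Hive build ships the feature).
set hive.auto.convert.join = true;

-- Print the current value back to confirm the setting was applied.
set hive.auto.convert.join;
```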

Problems with MetaStore

2011-03-20 Thread Anja Gruenheid
Hi! I'm trying to set up a test environment locally on my laptop, and it works if I use the standard embedded Derby driver. As soon as I add a hive-site.xml (I tried both MySQL and Derby, with servers definitely running and the respective parameters set in that hive-site.xml), I get the followin

Re: skew join optimization

2011-03-20 Thread Edward Capriolo
On Sun, Mar 20, 2011 at 11:20 AM, Ted Yu wrote: > How about link to http://imageshack.us/ or TinyPic ? > > Thanks > > On Sun, Mar 20, 2011 at 7:56 AM, Edward Capriolo > wrote: >> >> On Sun, Mar 20, 2011 at 10:30 AM, Ted Yu wrote: >> > Can someone re-attach the missing figures for that wiki ? >>

Re: skew join optimization

2011-03-20 Thread Ted Yu
How about link to http://imageshack.us/ or TinyPic ? Thanks On Sun, Mar 20, 2011 at 7:56 AM, Edward Capriolo wrote: > On Sun, Mar 20, 2011 at 10:30 AM, Ted Yu wrote: > > Can someone re-attach the missing figures for that wiki ? > > > > Thanks > > > > On Sun, Mar 20, 2011 at 7:15 AM, bharath vis

Re: skew join optimization

2011-03-20 Thread Edward Capriolo
On Sun, Mar 20, 2011 at 10:30 AM, Ted Yu wrote: > Can someone re-attach the missing figures for that wiki ? > > Thanks > > On Sun, Mar 20, 2011 at 7:15 AM, bharath vissapragada > wrote: >> >> Hi Igor, >> >> See http://wiki.apache.org/hadoop/Hive/JoinOptimization and see the >> jira 1642 which aut

Re: skew join optimization

2011-03-20 Thread Ted Yu
Can someone re-attach the missing figures for that wiki ? Thanks On Sun, Mar 20, 2011 at 7:15 AM, bharath vissapragada < bharathvissapragada1...@gmail.com> wrote: > Hi Igor, > > See http://wiki.apache.org/hadoop/Hive/JoinOptimization and see the > jira 1642 which automatically converts a normal

Re: skew join optimization

2011-03-20 Thread bharath vissapragada
Hi Igor, See http://wiki.apache.org/hadoop/Hive/JoinOptimization and see the jira 1642, which automatically converts a normal join into a map join (otherwise you can specify the mapjoin hints in the query itself). Because your 'S' table is very small, it can be replicated across all the mappers and
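
A sketch of the hint-based alternative mentioned above, applied to the T/S join from this thread. The original query's select list is elided, so the two columns selected here are placeholders chosen for illustration, not the poster's actual columns.

```sql
-- The MAPJOIN hint asks Hive to load the small table S into memory on
-- every mapper, so the join runs map-side with no reduce phase.
SELECT /*+ MAPJOIN(S) */
       T.id, T.timestamp          -- placeholder columns (assumption)
FROM T
LEFT OUTER JOIN S
  ON T.timestamp = S.timestamp AND T.id = S.id;
```

The hint names S because it is the small side of the join; T, the large table, is streamed through the mappers.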

Re: skew join optimization

2011-03-20 Thread Jov
2011/3/20 Igor Tatarinov : > I have the following join that takes 4.5 hours (with 12 nodes) mostly > because of a single reduce task that gets the bulk of the work: > SELECT ... > FROM T > LEFT OUTER JOIN S > ON T.timestamp = S.timestamp and T.id = S.id > This is a 1:0/1 join so the size of the out