Re: Tuning Triangle Joins on Hive

2014-08-06 Thread Firas Abuzaid
Thanks, Gopal! That helps a lot!

--Firas

On Wed, Aug 6, 2014 at 2:03 PM, Gopal V wrote:
> On 7/31/14, 12:28 PM, Firas Abuzaid wrote:
>> We're running various "triangle" join queries on Hive 0.9.0, and we're wondering if we can get any better performance. Here's the query we're running:

Re: Tuning Triangle Joins on Hive

2014-08-06 Thread Gopal V
On 7/31/14, 12:28 PM, Firas Abuzaid wrote:
We're running various "triangle" join queries on Hive 0.9.0, and we're wondering if we can get any better performance. Here's the query we're running:

SELECT count(*)
FROM table r1
JOIN table r2 ON (r1.dst = r2.src)
JOIN table r3 ON (r2.dst = r3.src AN

Re: Tuning Triangle Joins on Hive

2014-08-05 Thread Firas Abuzaid
Thanks, that's very helpful!

On Sat, Aug 2, 2014 at 12:47 PM, Lefty Leverenz wrote:
>> How do indexes work in Hive?
>
> See the Indexes design doc in the Hive wiki, although it hasn't been updated.
>
> -- Lefty
>
> On Sat, Au

Re: Tuning Triangle Joins on Hive

2014-08-02 Thread Lefty Leverenz
> How do indexes work in Hive?

See the Indexes design doc in the Hive wiki, although it hasn't been updated.

-- Lefty

On Sat, Aug 2, 2014 at 2:07 AM, chandra Reddy Bogala <chandra.reddy2...@gmail.com> wrote:
> How do indexes

Re: Tuning Triangle Joins on Hive

2014-08-01 Thread chandra Reddy Bogala
How do indexes work in Hive? I thought file formats like ORC have indexes in each block, but not a separate index that can help query performance.

Thanks,
Chandra

On Fri, Aug 1, 2014 at 9:10 AM, Devopam Mittra wrote:
> Please try the following approach and let me know if you are not getting

Re: Tuning Triangle Joins on Hive

2014-07-31 Thread Devopam Mittra
Please try the following approach and let me know if you are not getting better performance:

1. Ensure indexes are present on the dst and src columns in the respective tables.
2. Create a subset first taking r2 and r3 (i.e.: r3.src > r2.src) in a physical table, and then create index on its new src colu

Tuning Triangle Joins on Hive

2014-07-31 Thread Firas Abuzaid
Hi,

We're running various "triangle" join queries on Hive 0.9.0, and we're wondering if we can get any better performance. Here's the query we're running:

SELECT count(*)
FROM table r1
JOIN table r2 ON (r1.dst = r2.src)
JOIN table r3 ON (r2.dst = r3.src AND r3.dst = r1.src)
WHERE r1.src < r2.src