Re: Cross join/cartesian product explanation

2015-11-17 Thread Gopal Vijayaraghavan
>It’s really a very simple query that I’m trying to run: >select ... >bloom_contains(a_id, b_id_bloom) That's nearly impossible to optimize directly - there is no way to limit the number of table_b rows which may match table a. More than one bloom filter can successfully match a single row f
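To illustrate why the match count cannot be bounded, here is a minimal Python sketch of a Bloom filter membership test. It is not the poster's UDF: the `BloomFilter` class, its sizing, and the `matching_filters` helper are hypothetical stand-ins for `bloom_contains`, assumed only for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical stand-in for a b_id_bloom column value)."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a single integer

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True for every added item; may also be True for items never added.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def matching_filters(a_id, filters):
    """Return every b_id whose filter reports a_id as present."""
    return [b_id for b_id, bloom in filters.items() if a_id in bloom]


# Two table_b rows, each carrying a filter built from a different id set.
filters = {"b1": BloomFilter(), "b2": BloomFilter()}
for item in (1, 2, 3):
    filters["b1"].add(item)
for item in (3, 4):
    filters["b2"].add(item)

matches_for_3 = matching_filters(3, filters)
```

Here the id 3 was added to both filters, so it matches both rows. Since any filter may report a match, every (table_a row, table_b row) pair has to be tested, which is what makes the query a cross join.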

Re: Cross join/cartesian product explanation

2015-11-13 Thread Rory Sawyer
Hi Gopal, Thanks for the detailed response. It’s really a very simple query that I’m trying to run: select a.a_id, b.b_id, count(*) as c from table_a a, table_b b where bloom_contains(a_id, b_id_bloom) group by a.a_id, b.b_id; Where “bloom_contains” is a custom U

Re: Cross join/cartesian product explanation

2015-11-10 Thread Gopal Vijayaraghavan
>I’m having trouble doing a cross join between two tables that are too big >for a map-side join. The actual query would help btw. Usually what is planned as a cross-join can be optimized out into a binning query with a custom UDF. In particular with 2-D geo queries with binning, which people ten
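The binning idea can be sketched in Python as follows. This is an assumption about what such a query does, not code from the thread: the grid layout, `cell_of`, and `binned_pairs` are hypothetical names. Each point is assigned to a grid cell, and only points sharing a cell are paired, so the cell id acts as an ordinary join key.

```python
import math
from collections import defaultdict


def cell_of(x, y, cell_size):
    """Map a 2-D point to the integer grid cell that contains it."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def binned_pairs(points_a, points_b, cell_size):
    """Pair points from a and b that fall into the same grid cell."""
    bins = defaultdict(list)
    for name, x, y in points_b:
        bins[cell_of(x, y, cell_size)].append(name)

    pairs = []
    for name, x, y in points_a:
        # Only candidates in the same cell are considered, not all of b.
        for other in bins.get(cell_of(x, y, cell_size), []):
            pairs.append((name, other))
    return pairs


points_a = [("a1", 0.5, 0.5), ("a2", 5.2, 5.9)]
points_b = [("b1", 0.9, 0.1), ("b2", 5.5, 5.5), ("b3", 9.0, 9.0)]
pairs = binned_pairs(points_a, points_b, cell_size=1.0)
```

This simplified version misses pairs that are close together but sit on opposite sides of a cell border; a real query would probably also need to check neighbouring cells.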

Re: Cross join/cartesian product explanation

2015-11-09 Thread Rory Sawyer
Hi Gopal, Thanks for the speedy response! A follow-up question though: 10Mb input sounds like that would work for a map join. I’m having trouble doing a cross join between two tables that are too big for a map-side join. Trying to break down one table into small enough partitions and then union

Re: Cross join/cartesian product explanation

2015-11-06 Thread Gopal Vijayaraghavan
> Over the last few weeks I’ve been trying to use cross joins/cartesian >products and was wondering why, exactly, this all gets sent to one >reducer. All I’ve heard or read is that Hive can’t/doesn’t parallelize >the job. The hashcode of the shuffle key is 0, since you need to process every row a
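A small Python sketch of the effect described here, assuming a typical hash partitioner that picks a reducer as the key's hash modulo the number of reducers (the function names are hypothetical, not Hive APIs):

```python
from collections import Counter


def assign_reducer(shuffle_key_hash, num_reducers):
    """Pick a reducer from the key hash (assumed: non-negative hash mod reducers)."""
    return (shuffle_key_hash & 0x7FFFFFFF) % num_reducers


def reducer_load(key_hashes, num_reducers):
    """Count how many rows each reducer would receive."""
    return Counter(assign_reducer(h, num_reducers) for h in key_hashes)


# Cross join: every row carries the same shuffle key hash, 0.
cross_join_load = reducer_load([0] * 1000, num_reducers=8)

# Keyed join: distinct key hashes spread across the reducers.
keyed_join_load = reducer_load(range(1000), num_reducers=8)
```

With a constant hash, all 1000 rows land on reducer 0 no matter how many reducers are configured, while the distinct keys are split evenly across all 8.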

Re: cross join

2012-12-03 Thread Mark Grover
Periya, I am using Hive-0.8.1 and was able to get joins to work with: 1. No on clause 2. "on (true)" as the on clause. I am not entirely sure why you are getting exceptions with the above queries. Perhaps, there were some bugs in 0.7.1 that got resolved in 0.8.1? As far as workarounds go,

Re: Cross join in Hive.

2011-05-02 Thread Raghunath, Ranjith
Very cool. What is the non strict option for? Thanks, Ranjith From: Ashish Thusoo To: Sent: Mon May 02 16:00:39 2011 Subject: Re: Cross join in Hive. you could probably just say (1 = 1) in the on clause for the join. set hive.mapred.mode=nonstrict; select

Re: Cross join in Hive.

2011-05-02 Thread Ashish Thusoo
first table to be equal to the same column in the other table. Thanks, Ranjith From: Raghunath, Ranjith <ranjith.raghuna...@usaa.com> To: 'user@hive.apache.org' <user@hive.apache.org> Sent: Mon May 02 00:21:57 2011 Subject: Re:

Re: Cross join in Hive.

2011-05-01 Thread Raghunath, Ranjith
11 Subject: Re: Cross join in Hive. I haven't tested this out but plan to in 6 hours. Add an extra column and set it to 1 in both tables. Perform an inner join between the two tables. Thanks, Ranjith From: Abhinov Agarwal To: user@hive.apache.org Sent: S

Re: Cross join in Hive.

2011-05-01 Thread Raghunath, Ranjith
I haven't tested this out but plan to in 6 hours. Add an extra column and set it to 1 in both tables. Perform an inner join between the two tables. Thanks, Ranjith From: Abhinov Agarwal To: user@hive.apache.org Sent: Sun May 01 22:50:34 2011 Subject: Cross join
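The constant-column workaround can be checked with a small Python sketch. The table contents, the column name `one`, and the `inner_join` helper are hypothetical and only model an equi-join: when every row on both sides carries the same key value, the inner join returns every pairing, which is the cartesian product.

```python
from collections import defaultdict


def inner_join(left, right, key):
    """Hash-based equi-join of two lists of dict rows on a shared key column."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [(l, r) for l in left for r in index.get(l[key], [])]


# Both tables get an extra column fixed to 1 (hypothetical name: "one").
table_a = [{"a_id": a, "one": 1} for a in ("x", "y", "z")]
table_b = [{"b_id": b, "one": 1} for b in (10, 20)]

joined = inner_join(table_a, table_b, key="one")
id_pairs = [(l["a_id"], r["b_id"]) for l, r in joined]
```

Three rows joined to two rows yield six pairs. Because every row shares the single key value 1, all of the join work is also keyed on that one value.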