Re: Cross join/cartesian product explanation

Rory Sawyer Fri, 13 Nov 2015 07:19:12 -0800

Hi Gopal,

Thanks for the detailed response.


It’s really a very simple query that I’m trying to run:
select
    a.a_id,
    b.b_id,
    count(*) as c
from
    table_a a, 
    table_b b
where
    bloom_contains(a_id, b_id_bloom)
group by
    a.a_id,
    b.b_id;

Where “bloom_contains” is a custom UDF. The only changes I made were renaming 
the tables and columns. The tables I’m running against are small, roughly 
50-100 MB, but this query would need to scale to a table that is >100 GB 
(table_b would likely max out around 100 MB).
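
To make the semantics concrete, here is a rough Python sketch of what the query computes. It is only an illustration under stated assumptions: the toy Bloom filter (NUM_BITS, NUM_HASHES, the SHA-256 hashing) is a hypothetical stand-in for the real UDF, and it assumes b_id_bloom is a per-row column of table_b. The point is that the predicate is not an equality between columns, so it has to be evaluated for every (a, b) pair.

```python
import hashlib
from collections import Counter

NUM_BITS = 1024   # hypothetical filter size, not taken from the real UDF
NUM_HASHES = 3    # hypothetical number of hash functions


def _positions(value):
    """Yield the bit positions for a value, using salted SHA-256 digests."""
    for seed in range(NUM_HASHES):
        digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def make_bloom(values):
    """Build a toy Bloom filter, stored as an integer bitmask."""
    bits = 0
    for v in values:
        for pos in _positions(v):
            bits |= 1 << pos
    return bits


def bloom_contains(value, bloom):
    """Stand-in for the custom UDF: True if value may be in the filter."""
    return all((bloom >> pos) & 1 for pos in _positions(value))


def cross_join_count(table_a, table_b):
    """Test every (a, b) pair, then group by (a_id, b_id) and count."""
    counts = Counter()
    for a_id in table_a:                    # every row of table_a ...
        for b_id, b_id_bloom in table_b:    # ... against every row of table_b
            if bloom_contains(a_id, b_id_bloom):
                counts[(a_id, b_id)] += 1
    return counts


# Tiny made-up tables: table_a holds a_id values, table_b holds
# (b_id, b_id_bloom) rows. Assumed layout, for illustration only.
table_a = [1, 2, 3]
table_b = [("x", make_bloom([1, 2])), ("y", make_bloom([3]))]
result = cross_join_count(table_a, table_b)
```

The nested loop does len(table_a) * len(table_b) predicate evaluations, which is the cartesian product the thread subject refers to. Bloom filters can return false positives, so the result may contain pairs beyond the true matches.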

Any suggestions on how to approach this would be greatly appreciated.

Best,
Rory
