Thanks again.
I think I figured out the bug (I'm not sure whether it's a bug or a
known limitation when creating a third-level join). We need another
table c to re-create my scenario.
table_a
create table table_a(a_id bigint, common_id bigint, int_a int, int_b int,
int_c int, int_d int,
EXPLAIN select
t1.some_string,t2.some_string,sum(t1.total_count),sum(t2.total_count) from
table_a t1 join table_b t2 on t1.part_col = t2.part_col and t1.common_id =
t2.common_id where t1.part_col >= 'mypart' and t2.part_col >= 'mypart' group by
t1.some_string,t2.some_string;
OK
ABSTRACT SYNTAX
Thanks Appan for verifying. I will do some more tests on my side too and let
you know the results.
I tried a different version of the query where I joined two sub-queries for
the same partitions, and the data comes out correct.
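A sub-query join of that kind might look roughly like the sketch below. It is
not the exact query that was run: the table and column names are taken from
the EXPLAIN query earlier in the thread, and 'mypart' is the same placeholder
partition value used there.

```sql
-- Sketch only: names come from the EXPLAIN query earlier in the thread.
-- Each side is filtered to the wanted partitions inside its own sub-query,
-- and the join is applied to the already-filtered results.
SELECT a.some_string, b.some_string, SUM(a.total_count), SUM(b.total_count)
FROM (
  SELECT part_col, common_id, some_string, total_count
  FROM table_a
  WHERE part_col >= 'mypart'
) a
JOIN (
  SELECT part_col, common_id, some_string, total_count
  FROM table_b
  WHERE part_col >= 'mypart'
) b
ON (a.part_col = b.part_col AND a.common_id = b.common_id)
GROUP BY a.some_string, b.some_string;
```

Comparing the EXPLAIN output of this form with that of the direct join may show
where the partition predicates end up in each plan.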
I will see if I can post the real-world example to the list, be
Viral,
I tried the queries below (similar to yours) and I get the expected results
when I do the join. I ran my queries after building Hive from the latest source
and Hadoop 0.20+.
create table table_a(a_id bigint, common_id bigint, some_string
string,total_count bigint) partitioned by
Can you try this with a dummy table with very few rows ... to see if
the reason the script doesn't finish is a computational issue?
One other thing is to try with a combined partition, to see if it is a
problem with the partitioning.
Also, take a look at the results of an EXPLAIN statement, see
I haven't heard back from anyone on the list and am still struggling to join
two tables on a partitioned column
Has anyone ever tried joining two tables on a partitioned column and found
that the results are not as expected?
On Tue, Jan 18, 2011 at 2:04 AM, Viral Bajaria wrote:
> I am facing issues with a query