I am facing issues with a query where I am joining two fairly large tables on the partitioned column along with other common columns. The expected output is not in line with what I expect it to be. Since the query is very complex, I will simplify it so that people can provide inputs if they have faced similar issues or if I am doing something totally wrong.
TABLE A: a_id bigint common_id bigint some_string string total_count bigint part_col string <---- this is the partitioned column TABLE B: b_int bigint common_id bigint some_string string total_sum bigint part_col string <---- this is the partitioned column now the query is as follows: SELECT /*+ STREAMTABLE(A,B) */ A.some_string, B.some_string, sum(A.total_count), sum(B.total_sum) from A JOIN B ON (t1.part_col = t2.part_col AND t1.common_id = t2.common_id) WHERE t1.part_col >= 'val1' AND t2.part_col >= 'val1' GROUP BY A.some_string, B.some_string Does HIVE not like to join on the partitioned columns ? because when i create a join on just the partitioned column the reduce step never finishes. I am using HIVE 0.5.0 Thanks, Viral