Can you please provide more details: the number of rows in each table and per 
partition, the table structure, the Hive version, the table format, and 
whether the table is sorted or partitioned on dt?

Why don’t you use a join, potentially with a mapjoin hint?
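
For example, something along these lines. This is an untested sketch that only uses the table and column names from your query. The DISTINCT subquery is my own addition: it keeps the semantics of the IN form, since a plain join would otherwise repeat rows if small_table2 contains the same dt more than once.

```sql
-- Sketch only: rewrite the IN subquery as a join against the distinct dt values.
-- The MAPJOIN hint asks Hive to load the small side (alias b) into memory.
SELECT /*+ MAPJOIN(b) */ a.*
FROM large_table a
JOIN (SELECT DISTINCT dt FROM small_table2) b
  ON a.dt = b.dt
LIMIT 5;
```

Depending on the Hive version, the hint may be ignored unless hive.ignore.mapjoin.hint is set to false, and with hive.auto.convert.join enabled Hive may pick a map join on its own. Whether the join actually prunes partitions of large_table depends on the version and execution engine, so please check the plan with EXPLAIN.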

> On 19.12.2018 at 09:02, Prabhakar Reddy <prabha.cl...@gmail.com> wrote:
> 
> Hello,
> 
> I have a table large_table with more than 50K partitions, and when I run the 
> query below it runs forever. The other table, small_table2, has only five 
> partitions, yet whenever I run the query it seems to scan all partitions 
> of large_table rather than only the five partitions that exist in the 
> smaller table.
> 
> select * from large_table a  where a.dt in (select dt from small_table2) 
> limit 5;
> 
> Could you please confirm whether this is the expected behavior, or whether 
> there is any way we can tune this query to fetch results faster?
> 
> Regards
> Prabhakar Reddy
