Hi Jan, Glad you found something workable. What version of Hive are you using? Could you also please check what the value of the property hive.optimize.ppd is for you?
Thanks, Mark ----- Original Message ----- From: "Jan Dolinár" <dolik....@gmail.com> To: user@hive.apache.org Sent: Tuesday, May 29, 2012 1:57:25 AM Subject: Re: Multi-group-by select always scans entire table On Fri, May 25, 2012 at 12:03 PM, Jan Dolinár < dolik....@gmail.com > wrote: -- see what happens when you try to perform multi-group-by query on one of the partitions EXPLAIN EXTENDED FROM partition_test LATERAL VIEW explode(col1) tmp AS exp_col1 INSERT OVERWRITE DIRECTORY '/test/1' SELECT exp_col1 WHERE (part_col=2) INSERT OVERWRITE DIRECTORY '/test/2' SELECT exp_col1 WHERE (part_col=2); -- result: it wants to scan all partitions :-( Since nobody else did, let me answer myself... In the end I found out that the correct partition pruning can be achieved using subquery. Continuing the example from my last post, the query would be: FROM ( SELECT * FROM partition_test LATERAL VIEW explode(col1) tmp AS exp_col1 WHERE part_col=2 ) t INSERT OVERWRITE DIRECTORY '/test/1' SELECT exp_col1 INSERT OVERWRITE DIRECTORY '/test/2' SELECT exp_col1; I still think the pruning should work correctly no matter how the query is written, but for now I'm happy with this solution. J. Dolinar