RE: Bucket pruning

2015-03-23 Thread Mich Talebzadeh
From: matshyeq [mailto:matsh...@gmail.com] Sent: 23 March 2015 10:41 To: user Cc: Daniel Haviv Subject: Re: Bucket pruning To me

Re: Bucket pruning

2015-03-23 Thread matshyeq
To me there's practically very little difference between partitioning and bucketing (partitioning defines the split criteria explicitly, whereas bucketing does so somewhat implicitly). Hive, however, recognises the latter as a separate feature and handles the two in quite different ways. There's already a featur
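
To make the explicit-versus-implicit distinction concrete, here is a minimal Python sketch of what pruning on a bucketed column could look like. It assumes that an integer key hashes to itself and that the bucket index is the hash with the sign bit masked off, taken modulo the bucket count. The names `bucket_for` and `buckets_to_scan` are hypothetical and do not come from Hive's code.

```python
# Hypothetical sketch of bucket pruning; not Hive's actual implementation.
INT_MAX = 0x7FFFFFFF  # mask that clears the sign bit of a 32-bit hash


def bucket_for(key_hash: int, num_buckets: int) -> int:
    """Map a hash code to a bucket index (assumed: mask sign bit, then modulo)."""
    return (key_hash & INT_MAX) % num_buckets


def buckets_to_scan(filter_values, num_buckets):
    """For an equality or IN filter on the bucketing column, return the
    sorted bucket indexes that can contain matching rows."""
    # Assumption: integer keys hash to their own value.
    return sorted({bucket_for(v, num_buckets) for v in filter_values})


if __name__ == "__main__":
    # With 8 buckets, a filter such as x IN (42, 10, 18) touches one bucket.
    print(buckets_to_scan([42, 10, 18], 8))
```

With a partitioned table the equivalent lookup is a direct match on the partition value, whereas here the reader has to recompute the hash to learn which bucket file to open.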

Re: Bucket pruning

2015-03-13 Thread cobby
Hi, thanks for the detailed response. I will experiment with your suggested ORC bloom filter solution. It seems to me the obvious, most straightforward solution is to add support for hash partitioning, so I can do something like: create table T() partitioned by (x into num_partitions,..). upon

Re: Bucket pruning

2015-03-12 Thread Gopal Vijayaraghavan
Hi, No, and it's a shame because we're stuck on some compatibility details with this. The primary issue is the fact that the InputFormat is very generic and offers no way to communicate StorageDescriptor or bucketing. The split generation for something like SequenceFileInputFormat lives inside MapRedu