Hash partitioning (bucketing) does not make much sense for YYYY/MM/DD with 32 buckets, as pointed out.
So it is clear that with (mod 32) the maximum number of offsets is going to be 32, i.e. in the range 0-31. With YYYY/MM/DD you have to account for hash collisions as well. The set of input values is potentially large (and not known until we have encountered them all), and if you want to spread them evenly (after all, that is what hash partitioning is all about), then I think day of the month makes more sense.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 19 August 2016 at 23:15, Gopal Vijayaraghavan <gop...@apache.org> wrote:

> > We are bucketing by date so we will have max 32 buckets
>
> If you do want to look up specifically by date, you could just create day
> partitions and never partition by month.
>
> FYI, in a modern version of Hive
>
>     select count(1) from table where YEAR(dt) = 2016 and MONTH(dt) = 12
>
> does prune it on the client side.
>
> On a different note, 31 buckets is a bad idea (32 is ok), because Java
> String hashes are built with the multiplier 31 (i.e. 32 - 1) — that is the
> magic number which hurts "yyyymmdd" keys, leaving 50% of your buckets
> with 0 data.
>
> http://www.slideshare.net/t3rmin4t0r/data-organization-hive-meetup/6
>
> If you use yyyymmdd as a number instead, you'll get the same number back
> as the hashcode, so bucket placement won't be stable as months change
> (20160816 % 32 == 16 while 20160716 % 32 == 12).
>
> The only way to have buckets correspond to a day of the month is to
> extract day_of_month as an int and bucket on it with 32 - then bucket 1 ==
> day 1, bucket 2 == day 2, ..., bucket 31 == day 31 (bucket 0 stays empty).
>
> Cheers,
> Gopal
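[Editor's note] The instability Gopal describes can be checked with plain arithmetic. Below is a small Python sketch (Python only for illustration; the point is the modulo math, since the hash of an integer is the value itself):

```python
# Bucketing on yyyymmdd as an int: the "hash" of an int is the value itself,
# so bucket = value % num_buckets, and the same day of the month lands in a
# different bucket each month.
NUM_BUCKETS = 32

print(20160816 % NUM_BUCKETS)  # 16 Aug 2016 -> bucket 16
print(20160716 % NUM_BUCKETS)  # 16 Jul 2016 -> bucket 12

# Bucketing on day_of_month as an int instead: days 1..31 map straight
# through to buckets 1..31, stable across months (bucket 0 stays empty).
stable = all(day % NUM_BUCKETS == day for day in range(1, 32))
print("day-of-month buckets stable:", stable)
```

This is why extracting day_of_month gives a fixed day-to-bucket mapping, while bucketing on the full yyyymmdd number does not.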
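[Editor's note] The 31-vs-32 bucket point can also be verified numerically. The sketch below re-implements Java's String.hashCode() (the multiplier-31 polynomial hash that Hive-style string bucketing is based on; the bucket formula `(hash & Integer.MAX_VALUE) % numBuckets` follows Hive's convention, but the helper names here are mine, not Hive's) and counts how many buckets one year of "yyyymmdd" keys actually touches:

```python
from datetime import date, timedelta

def java_string_hash(s: str) -> int:
    """Java String.hashCode(): h = 31*h + ord(c), in 32-bit signed arithmetic."""
    h = 0
    for c in s:
        h = (31 * h + ord(c)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket(key: str, num_buckets: int) -> int:
    # Hive-style bucket id: (hashCode & Integer.MAX_VALUE) % numBuckets
    return (java_string_hash(key) & 0x7FFFFFFF) % num_buckets

# One year of yyyymmdd keys (2016 is a leap year: 366 days).
keys = [(date(2016, 1, 1) + timedelta(days=d)).strftime("%Y%m%d")
        for d in range(366)]

used_31 = {bucket(k, 31) for k in keys}
used_32 = {bucket(k, 32) for k in keys}

# Taken mod 31, every 31^k multiplier term vanishes, so for these keys only
# the final character (the last digit of the day) moves the bucket: the ten
# digit characters '0'..'9' can reach at most 10 of the 31 buckets.
print(f"31 buckets: {len(used_31)} used, {31 - len(used_31)} empty")
print(f"32 buckets: {len(used_32)} used, {32 - len(used_32)} empty")
```

Running it shows 31 buckets doing dramatically worse than 32 for "yyyymmdd" keys, which is the effect behind the slide Gopal links.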