You can always write something simple to hand call the HashPartitioner. Jython works for quick tests. But the code in hash partitioner is essentially ((int) key.hashcode()) % num reduces. Since nothing else is in play, I suspect there is an incorrect assumption somewhere.
On Fri, Jun 12, 2009 at 11:25 AM, Zhengguo 'Mike' SUN <[email protected] > wrote: > Hi, > > The intermediate key generated by my Mappers is IntWritable. I tested with > different number of Reducers. When the number of Reducers is the same as the > number of different keys of intermediate output. It partitions perfectly. > Each Reducer receives one input group. When these two numbers are different, > the partitioning function becomes difficult to understand. For example, when > the number of keys is less than the number of Reducers, I am expecting that > each Reducer at most receive one input group. But it turns out that many > Reducers receive more than one input group. On the other hand, when the > number of keys is larger than the number of Reducers, I am expecting that > each Reducer at least receive one input group. But it turns out that some > Reducers receive nothing to process. The expectation I had is from the > implementation of HashPartitioner class, which just uses modulo operator > with the number of Reducers to generate partitions. > > Anyone has any insights into this? > > > > -- Pro Hadoop, a book to guide you from beginner to hadoop mastery, http://www.apress.com/book/view/9781430219422 www.prohadoopbook.com a community for Hadoop Professionals
