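You can always write something simple to call the HashPartitioner by hand.
Jython works for quick tests.
But the code in HashPartitioner is essentially
(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
Keys whose hash codes leave the same remainder go to the same reducer, so an
even spread only happens when the hash codes cover every remainder.
Since nothing else is in play, I suspect there is an incorrect assumption
somewhere.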
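
For a quick check without a cluster, here is a minimal sketch of that
arithmetic in plain Python. It assumes the keys are IntWritable, whose
hashCode() is just the wrapped int value; partition and reducers_by_key are
hypothetical helper names for illustration, not Hadoop APIs.

```python
# Sketch of hash partitioning for integer keys, mirroring
# (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
# Assumption: the key's hash code equals its int value (as for IntWritable).

INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def partition(key, num_reduces):
    """Return the reducer index an integer key would be assigned to."""
    return (key & INT_MAX) % num_reduces


def reducers_by_key(keys, num_reduces):
    """Group keys by the reducer index they map to."""
    buckets = {}
    for key in keys:
        buckets.setdefault(partition(key, num_reduces), []).append(key)
    return buckets


# Three distinct keys, four reducers: fewer keys than reducers,
# yet keys 10 and 30 share reducer 2 because 10 % 4 == 30 % 4.
few_keys = reducers_by_key([10, 20, 30], 4)

# Five distinct keys, four reducers: more keys than reducers,
# yet only reducers 0 and 2 receive anything because every key is even.
many_keys = reducers_by_key([0, 2, 4, 6, 8], 4)
```

With keys like these, some reducers get several groups and others get none,
which is consistent with what the modulo predicts rather than a sign of a bug.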


On Fri, Jun 12, 2009 at 11:25 AM, Zhengguo 'Mike' SUN <[email protected]
> wrote:

> Hi,
>
> The intermediate key generated by my Mappers is IntWritable. I tested with
> different numbers of Reducers. When the number of Reducers is the same as the
> number of distinct keys in the intermediate output, it partitions perfectly:
> each Reducer receives one input group. When these two numbers are different,
> the partitioning function becomes difficult to understand. For example, when
> the number of keys is less than the number of Reducers, I expect each
> Reducer to receive at most one input group, but it turns out that many
> Reducers receive more than one input group. On the other hand, when the
> number of keys is larger than the number of Reducers, I expect each
> Reducer to receive at least one input group, but it turns out that some
> Reducers receive nothing to process. My expectation comes from the
> implementation of the HashPartitioner class, which just uses the modulo
> operator with the number of Reducers to generate partitions.
>
> Does anyone have any insights into this?
>
>
>
>




-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals
