On Aug 23, 2011, at 2:25 AM, Peter Schuller wrote:

>> We've recently been having issues where, as soon as we start doing heavy 
>> writes (via Hadoop), it really hammers 4 of our 20 nodes.  We're using 
>> RandomPartitioner and we've set the initial tokens for our 20 nodes according 
>> to the general spacing formula, except for a few token offsets where we've 
>> replaced dead nodes.
> 
> Is the hadoop job iterating over keys in the cluster in token order
> perhaps, and you're generating writes to those keys? That would
> explain a "moving hotspot" along the cluster.

Yes - we're iterating over all the keys of particular column families, doing 
joins with Pig as we enrich the data and compute measures.  When we write, 
we're usually writing out a small subset of keys, which shouldn't produce 
hotspots with RandomPartitioner, AFAICT.
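
For what it's worth, here's roughly how we laid out the initial tokens - just a 
minimal sketch of the usual even spacing for RandomPartitioner (token i = 
i * 2**127 / N); the node count is ours, and it doesn't reflect the offsets we 
applied when replacing dead nodes:

    # Sketch: evenly spaced initial tokens for RandomPartitioner
    # (token space 0 .. 2**127). Assumes 20 nodes; the offsets from
    # replaced dead nodes are not included.
    NUM_NODES = 20
    TOKEN_SPACE = 2 ** 127

    tokens = [i * TOKEN_SPACE // NUM_NODES for i in range(NUM_NODES)]

    for node, token in enumerate(tokens):
        print("node %2d: initial_token = %d" % (node, token))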

> 
> -- 
> / Peter Schuller (@scode on twitter)
