On Tue, Aug 20, 2013 at 11:35 PM, Keith Wright <kwri...@nanigans.com> wrote:

> Still looking for help!  We have stopped almost ALL traffic to the cluster
> and still some nodes are showing almost 1000% CPU for cassandra with no
> iostat activity.   We were running cleanup on one of the nodes that was not
> showing load spikes however now when I attempt to stop cleanup there via
> nodetool stop cleanup the java task for stopping cleanup itself is at 1500%
> and has not returned after 2 minutes.  This is VERY odd behavior.  Any
> ideas?  Hardware failure?  Network?  We are not seeing anything there but
> wanted to get ideas.
>

The most obvious answer is that the problem nodes have somehow crossed a
threshold that makes them "thrash" in GC, which would fit high CPU with no
iostat activity.

If you restart the affected nodes, does the error condition return? If so,
how quickly?
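
One way to test the GC theory is to sample the Cassandra JVM with
`jstat -gcutil <pid> 1000` and see how much wall-clock time goes to
collection. The sketch below is only an illustration, not something run
against this cluster: the function name and the sample numbers are made
up, and it assumes jstat output that has the usual header row with `FGC`
and `GCT` columns.

```python
def gc_time_fraction(jstat_output, interval_seconds):
    """Estimate GC load from `jstat -gcutil <pid> <interval>` output.

    Returns (fraction of wall-clock time spent in GC, number of full GCs)
    between the first and last sample. Columns are located by header name,
    so the exact column layout of the JVM version does not matter.
    """
    lines = [ln.split() for ln in jstat_output.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    if len(rows) < 2:
        raise ValueError("need at least two samples to compute a delta")

    gct_col = header.index("GCT")  # cumulative GC time, in seconds
    fgc_col = header.index("FGC")  # cumulative count of full GC events

    elapsed = interval_seconds * (len(rows) - 1)
    gc_seconds = float(rows[-1][gct_col]) - float(rows[0][gct_col])
    full_gcs = int(rows[-1][fgc_col]) - int(rows[0][fgc_col])
    return gc_seconds / elapsed, full_gcs


# Hypothetical samples taken one second apart (invented for illustration).
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  99.80  71.20  98.70  59.90   1200   40.000   300  600.000  640.000
  0.00  99.80  88.10  99.10  59.90   1200   40.000   301  600.900  640.900
  0.00  99.80  93.40  99.30  59.90   1200   40.000   302  601.800  641.800
"""

fraction, full_gcs = gc_time_fraction(SAMPLE, 1.0)
print("GC share of wall time: %.2f, full GCs: %d" % (fraction, full_gcs))
```

If the fraction stays close to 1 and the full GC count keeps climbing
while old gen (`O`) stays nearly full, that would be consistent with GC
thrash; a low fraction would point somewhere else.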

=Rob
