Re: Nodes dropping out of cluster due to GC

Peter Schüller Thu, 03 Jun 2010 03:46:31 -0700

> We did indeed have a problem with our GC settings.  The survivor ratio was
> too low.  After changing that things are better but we are still seeing GC
> that takes 5-10 seconds, which is enough for the node to drop out of the
> cluster briefly.


This still indicates full GCs. What is your write activity like? Do
you know if you're legitimately growing the heap quickly enough that
the concurrent marking in CMS is unable to catch up? What is the free
heap ratio (according to the logs produced with
-XX:+PrintGC/-XX:+PrintGCDetails) after a concurrent mark-sweep has
finished?

If the heap is very full even after a mark/sweep you likely need a
bigger heap or smaller cache sizes, memtable flush thresholds, etc.
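
As an alternative to reading the post-collection occupancy out of the GC
log, you can ask the JVM directly. The following is only a rough sketch,
assuming it runs inside the JVM you care about (or is adapted to connect
to it over JMX); the class name HeapFreeRatio and the helper freeRatio
are made up for illustration. It relies on the standard
java.lang.management API, where MemoryPoolMXBean.getCollectionUsage()
reports a pool's usage as of its most recent collection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints how much of each heap pool was free
// right after that pool's most recent collection.
public class HeapFreeRatio {

    // Fraction of the pool that is free, given bytes used and the pool limit.
    // Returns NaN when the limit is unknown (-1) or zero.
    public static double freeRatio(long used, long limit) {
        if (limit <= 0) {
            return Double.NaN;
        }
        return 1.0 - ((double) used / (double) limit);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage after = pool.getCollectionUsage();
            if (after == null) {
                continue; // this pool does not support collection usage
            }
            // getMax() is -1 when undefined; fall back to committed size.
            long limit = after.getMax() > 0 ? after.getMax() : after.getCommitted();
            System.out.printf("%s: %.1f%% free after last collection%n",
                    pool.getName(), 100.0 * freeRatio(after.getUsed(), limit));
        }
    }
}
```

The number to look at is the one for the old generation pool. If it
stays low after collections, the heap really is full; if it is high,
the late-start explanation below becomes more likely.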

On the other hand if you have very significant amounts of free space
in the heap after a mark/sweep, the problem may rather be that CMS is
just kicking in too late. If so you can experiment with the
-XX:+UseCMSInitiatingOccupancyOnly and
-XX:CMSInitiatingOccupancyFraction=XXX options. If you're willing to
temporarily accept that CMS is continuously running (due to an
aggressive initiating occupancy fraction), that should at least tell
you whether you can in fact avoid the fallbacks to a full
stop-the-world GC. If so, you can then look at more proper tuning.
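
When experimenting like this it is worth confirming that the running
JVM actually picked up the options. A minimal sketch, assuming you can
run code inside the node's JVM; the class name CmsFlagCheck and its two
helpers are invented for this example, and it only inspects the command
line arguments as reported by the standard RuntimeMXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports whether the CMS initiating-occupancy
// options appear among the JVM's startup arguments.
public class CmsFlagCheck {
    private static final String ONLY_FLAG = "-XX:+UseCMSInitiatingOccupancyOnly";
    private static final String FRACTION_PREFIX = "-XX:CMSInitiatingOccupancyFraction=";

    // True if the enabling form of the flag is present. This does not
    // account for a later -XX:-UseCMSInitiatingOccupancyOnly override.
    public static boolean occupancyOnly(List<String> jvmArgs) {
        return jvmArgs.contains(ONLY_FLAG);
    }

    // The configured fraction, or -1 if the option is absent.
    // If the option is repeated, the last occurrence is returned.
    public static int occupancyFraction(List<String> jvmArgs) {
        int fraction = -1;
        for (String arg : jvmArgs) {
            if (arg.startsWith(FRACTION_PREFIX)) {
                fraction = Integer.parseInt(arg.substring(FRACTION_PREFIX.length()).trim());
            }
        }
        return fraction;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("UseCMSInitiatingOccupancyOnly set: " + occupancyOnly(jvmArgs));
        System.out.println("CMSInitiatingOccupancyFraction: " + occupancyFraction(jvmArgs));
    }
}
```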

-- 
/ Peter Schuller aka scode
