Thanks for your quick response! I am currently running the performance tests with extended GC logging. I will post the GC log if clients time out at the same moment that a full garbage collection runs.
Thanks,
Rene

-----Original Message-----
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Tuesday, 20 December 2011 6:36
To: user@cassandra.apache.org
Subject: Re: Garbage collection freezes cassandra node

> During the garbage collections, Cassandra freezes for about ten seconds. I
> observe the following log entries:
>
> “GC for ConcurrentMarkSweep: 11597 ms for 1 collections, 1887933144 used; max
> is 8550678528”

Ok, first off: Are you certain that it is actually pausing, or are you assuming that due to the log entry above? Because the log entry in no way indicates a 10 second pause; it only indicates that CMS took 10 seconds - which is entirely expected, since most of CMS is concurrent and implies only short pauses. A full pause can happen, but that log entry is expected and is not in and of itself indicative of a stop-the-world 10 second pause.

It is fully expected using the CMS collector that you'll have a sawtooth pattern as young gen is being collected, and then a sudden drop as CMS does its job concurrently without pausing the application for a long period of time.

I will second the recommendation to run with -XX:+DisableExplicitGC (or -XX:+ExplicitGCInvokesConcurrent) to eliminate that as a source.

I would also run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps and report back the results (i.e., the GC log around the time of the pause).

Your graph is looking very unusual for CMS. It's possible that everything is otherwise as it should be and CMS is kicking in too late, but I am kind of skeptical of that given the extremely smooth look of your graph.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
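
To make the point about the log entry concrete, here is a minimal Python sketch that pulls the numbers out of the line quoted above. The regular expression and the name `parse_gc_line` are assumptions for illustration, inferred only from the single entry in this thread; other Cassandra versions may format the line differently.

```python
import re

# Pattern inferred from the one log line quoted in this thread (an assumption,
# not a documented format).
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return the fields of a GC log line as a dict, or None if it does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    used = int(m.group("used"))
    max_heap = int(m.group("max"))
    return {
        "collector": m.group("collector"),
        "seconds": int(m.group("ms")) / 1000.0,  # time the collector took, not pause time
        "collections": int(m.group("count")),
        "used_bytes": used,
        "max_bytes": max_heap,
        "heap_fraction": used / max_heap,  # heap occupancy reported in the entry
    }


line = ("GC for ConcurrentMarkSweep: 11597 ms for 1 collections, "
        "1887933144 used; max is 8550678528")
info = parse_gc_line(line)
print(info["collector"], info["seconds"], round(info["heap_fraction"], 3))
```

For the quoted entry this reports about 11.6 seconds of CMS activity and a heap that is roughly 22% full. The duration is how long CMS ran, mostly concurrently, so it says nothing about stop-the-world time; actual pause lengths have to come from the -XX:+PrintGCDetails output.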