2011/3/8 Peter Schuller <peter.schul...@infidyne.com>
>
> (1) I cannot stress this one enough: Run with -XX:+PrintGC
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output.
> (2) Attach to your process with jconsole or some similar tool.
> (3) Observe the behavior of the heap over time. Preferably post
> screenshots so others can look at them.
>

I'm not sure you fully understood me, sorry.
I launch Cassandra with the following GC logging options (I didn't mention
this before because this document,
http://www.datastax.com/docs/0.7/troubleshooting/index#nodes-seem-to-freeze-after-some-period-of-time
says nothing about gc.log):

    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

I detect that the nodes are frozen from log entries like the following:

    Total time for which application threads were stopped: 30.0000957 seconds

And so on. Also, when I think the nodes are frozen, I get
UnavailableException and TimedOutException about 20-30 times (I make
several attempts, up to 300 with a 1-second sleep between them, before the
final failure). The following code fragment illustrates what I do:

    // Retry the write up to 300 times, sleeping 1 second after each
    // UnavailableException or TimedOutException.
    for (; $l_i < 300; ++$l_i) {
        try {
            $client->batch_mutate($mutations, cassandra_ConsistencyLevel::QUORUM);
            $retval = true;
            break;
        } catch (cassandra_UnavailableException $e) {
            array_push($l_exceptions, get_class($e));
            sleep(1);
        } catch (cassandra_TimedOutException $e) {
            array_push($l_exceptions, get_class($e));
            sleep(1);
        } catch (Exception $e) {
            // Any other error is logged and not retried.
            $loger->err(get_class($e).': '.$e->getMessage());
            $loger->err($mutations);
            break;
        }
    }
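
For picking the long pauses out of gc.log, something along these lines might help. It is only a rough sketch: the function name long_pauses and the default 1-second threshold are assumptions made for illustration, and it relies only on the "Total time for which application threads were stopped" line format shown above.

```shell
# Hypothetical helper (not part of Cassandra or the JVM): print every
# "application threads were stopped" entry whose pause exceeds a threshold.
# Usage: long_pauses /var/log/cassandra/gc.log 1.0
long_pauses() {
    log_file=$1
    threshold=${2:-1.0}   # assumed default: 1 second
    awk -v limit="$threshold" '
        /Total time for which application threads were stopped:/ {
            # The pause length is the field just before the first "seconds".
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^seconds/) {
                    if ($(i - 1) + 0 > limit + 0) {
                        print $0
                    }
                    break
                }
            }
        }
    ' "$log_file"
}
```

Running long_pauses /var/log/cassandra/gc.log 10 would list only the entries with stops longer than 10 seconds, which should make it easier to line them up against the times of the UnavailableException and TimedOutException failures.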