We have a 60-node C* cluster running 2.2.7, with about a 20GB heap allocated
to each node.  We're aware of the recommended 8GB limit to keep GC pauses low,
but our memory usage has been creeping up, probably related to this bug.

Here's what we're seeing: at a low level of writes, everything generally
looks good.

What happens is that we then need to catch up, so we do a TON of writes
all in a small time window.  Then C* nodes start dropping like flies.  Some
of them just GC frequently and are able to recover, but when they GC like
this we see pauses in the 30-second range, which cause them to stop
gossiping for a while, and they drop out of the cluster.
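
For reference, here's roughly how we pull the worst pauses out of the GC logs
(the path and log format come from the -Xloggc and
-XX:+PrintGCApplicationStoppedTime flags listed further down; the awk field
matching is just a sketch and may need tweaking):

  # print the ten longest safepoint pauses, in seconds, across the rotated GC logs
  grep -h 'Total time for which application threads were stopped' /var/log/cassandra/gc.log* \
    | awk '{ for (i = 1; i <= NF; i++) if ($i == "stopped:") print $(i+1) }' \
    | sort -rn | head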

This happens as a flurry around the cluster, so we're not always able to
catch which ones are doing it before they recover.  However, if we have 3
down at once, we mostly have a locked-up cluster: writes don't complete and
our app essentially locks up.
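
(We've been trying to catch the flapping with something like the loop below;
nodetool status marks down nodes with a DN state, so this is just a sketch of
how we watch for them.)

  # log any nodes the local node sees as Down/Normal, every 10 seconds
  while true; do date; nodetool status | grep '^DN'; sleep 10; done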

SOME of the boxes never recover. I'm in this state now: we have 3-5 nodes
stuck in GC storms that they won't recover from.

I reconfigured the GC settings to enable jstat.

I was able to catch it while it was happening:

root@util0067 ~ # sudo -u cassandra jstat -gcutil 4235 2500
  S0     S1      E      O      M     CCS    YGC     YGCT   FGC     FGCT      GCT
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332
  0.00 100.00 100.00  94.76  97.60  93.06  10435 1686.191   471 1139.142 2825.332

... as you can see, the box is legitimately out of memory: S1 and Eden are
completely full, and old gen is sitting around 95%, well past our 75% CMS
initiating occupancy.
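
The next thing I plan to try on one of the stuck boxes is a class histogram
to see what's actually filling old gen.  Assuming we can still attach to the
process, something like the following (note that -histo:live forces a full GC,
so plain -histo is probably safer on a box that's already thrashing):

  sudo -u cassandra jmap -histo 4235 | head -n 30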

I'm not sure where to go from here.  I think 20GB for our workload is more
than reasonable.

90% of the time they're well below 10GB used.  While I was watching this
box, I was seeing about 30% usage until it decided to climb to 100%.
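
(For reference, a quick way to poll the heap on a box like this, in case it
helps anyone reproduce what I'm seeing; nodetool info reports heap used/total
in MB.)

  # poll heap usage every 30 seconds
  while true; do date; nodetool info | grep 'Heap Memory'; sleep 30; done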

Any advice on what to do next?  I don't see anything obvious in the logs
to signal a problem.

I've attached all the command-line arguments we use.  Note that I think the
cassandra-env.sh script puts them in there twice (see the jcmd check after
the flag list below).

-ea
-javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar
-XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms20000M
-Xmx20000M
-Xmn4096M
-XX:+HeapDumpOnOutOfMemoryError
-Xss256k
-XX:StringTableSize=1000003
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB
-XX:CompileCommandFile=/hotspot_compiler
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSWaitDuration=10000
-XX:+UseCondCardMark
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
-Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
-Djava.net.preferIPv4Stack=true
-Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.rmi.port=7199
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.library.path=/usr/share/cassandra/lib/sigar-bin
-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
-ea
-javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar
-XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms20000M
-Xmx20000M
-Xmn4096M
-XX:+HeapDumpOnOutOfMemoryError
-Xss256k
-XX:StringTableSize=1000003
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB
-XX:CompileCommandFile=/etc/cassandra/hotspot_compiler
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSWaitDuration=10000
-XX:+UseCondCardMark
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
-Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
-Djava.net.preferIPv4Stack=true
-Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.rmi.port=7199
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.library.path=/usr/share/cassandra/lib/sigar-bin
-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
-Dlogback.configurationFile=logback.xml
-Dcassandra.logdir=/var/log/cassandra
-Dcassandra.storagedir=
-Dcassandra-pidfile=/var/run/cassandra/cassandra.pid
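
For what it's worth, an easy way to double-check which flags the JVM actually
picked up (the pid is the same one from the jstat run above; jcmd ships with
the JDK):

  sudo -u cassandra jcmd 4235 VM.command_line
  sudo -u cassandra jcmd 4235 VM.flags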


-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
