Compaction falling behind will likely cause additional work on reads (more sstables to merge), but I’d be surprised if it manifested in very long GC pauses. When you say twice as many sstables, how many is that?
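One way to put numbers on that is to pull the SSTable count and the largest compacted row size per table out of the cfstats output on each node and compare them side by side. The following is a minimal sketch, not a definitive tool: the function name `summarize_cfstats` is made up for illustration, and the label patterns it matches are assumptions about the cfstats text format, which differs between Cassandra versions.

```shell
# Summarize `nodetool cfstats` output read from stdin: one line per table
# with its SSTable count and its largest compacted row/partition size.
# ASSUMPTION: the labels matched below are how your Cassandra version prints
# these fields; adjust the patterns if your output uses different labels.
summarize_cfstats() {
  awk -F': *' '
    /^[ \t]*(Table|Column Family):/ { table = $2 }
    /^[ \t]*SSTable count:/         { sstables = $2 }
    /^[ \t]*Compacted (row maximum size|partition maximum bytes):/ {
      printf "%s sstables=%s max_row_bytes=%s\n", table, sstables, $2
    }
  '
}
```

Running something like `nodetool -h <host> cfstats | summarize_cfstats` against a healthy node and against one of the two slow nodes should make any difference in sstable count or maximum row size easy to spot.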
In cfstats, does anything stand out? Is max row size on those nodes larger than on other nodes?

What you don’t show in your JVM options is the new gen size. If you do have unusually large partitions on those two nodes (especially likely if you have rf=2; if you have rf=3, then there’s probably a third node misbehaving that you haven’t found yet), then raising new gen size can help handle the garbage created by reading large partitions without having to tolerate promotion into the old generation. Estimates for the amount of garbage vary, but it could be “gigabytes” of garbage on a very wide partition (see https://issues.apache.org/jira/browse/CASSANDRA-9754 for work in progress to help mitigate that type of pain).

- Jeff

From: Anishek Agarwal
Reply-To: "[email protected]"
Date: Tuesday, March 1, 2016 at 11:12 PM
To: "[email protected]"
Subject: Lot of GC on two nodes out of 7

Hello,

We have a Cassandra cluster of 7 nodes. All of them have the same JVM GC configuration, and all our writes and reads use the TokenAware Policy wrapping a DCAware policy. All nodes are part of the same datacenter.

We are seeing that two nodes have high GC collection times. They mostly seem to spend time in GC, around 300-600 ms. This also seems to result in higher CPU utilisation on these machines. The other 5 nodes don't have this problem.

There is no additional repair activity going on in the cluster, and we are not sure why this is happening. We checked cfhistograms on the two CFs we have in the cluster, and the number of reads seems to be almost the same. We also used cfstats to see the number of sstables on each node, and one of the nodes with the above problem has twice the number of sstables as the other nodes. This still does not explain why two nodes have high GC overheads.
Our GC config is as below:

JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=50"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
JVM_OPTS="$JVM_OPTS -XX:MaxPermSize=256m"
JVM_OPTS="$JVM_OPTS -XX:+AggressiveOpts"
JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=48"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=48"
JVM_OPTS="$JVM_OPTS -XX:-ExplicitGCInvokesConcurrent"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
# earlier value 131072 = 32768 * 4
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=131072"
JVM_OPTS="$JVM_OPTS -XX:CMSScheduleRemarkEdenSizeThreshold=104857600"
JVM_OPTS="$JVM_OPTS -XX:CMSRescanMultiple=32768"
JVM_OPTS="$JVM_OPTS -XX:CMSConcMarkMultiple=32768"
#new
JVM_OPTS="$JVM_OPTS -XX:+CMSConcurrentMTEnabled"

We are using Cassandra 2.0.17. If anyone has any suggestions as to what else we can look at to understand why this is happening, please do reply.

Thanks,
anishek
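On the new gen size that the reply asks about: in a stock cassandra-env.sh it is normally controlled through HEAP_NEWSIZE, set together with MAX_HEAP_SIZE, and reaches the JVM as -Xmn. A minimal sketch of what that could look like, assuming the stock script layout; the 8G and 2G values are placeholders for illustration only, not a recommendation for this cluster:

```shell
# Illustrative values only (assumptions, not tuned for any real cluster);
# size these for your own hardware and workload.
# cassandra-env.sh expects the two variables to be set (or unset) as a pair.
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="2G"

# The stock script then turns them into JVM flags along these lines:
JVM_OPTS="${JVM_OPTS:-} -Xms${MAX_HEAP_SIZE} -Xmx${MAX_HEAP_SIZE} -Xmn${HEAP_NEWSIZE}"
echo "$JVM_OPTS"
```

If the two slow nodes do turn out to hold unusually wide partitions, raising HEAP_NEWSIZE on them and watching the GC logs is one way to check whether less garbage then gets promoted.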
