Hi John,

The other main source of STW pauses in the JVM is the safepoint mechanism: http://blog.ragozin.info/2012/10/safepoints-in-hotspot-jvm.html
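If you just want a quick sanity check before touching any settings, the JVM keeps cumulative safepoint counters you can dump at runtime. This is only a sketch: it assumes your JDK ships jcmd and exposes the sun.rt.* safepoint counters, and <cassandra_pid> is a placeholder for the actual process id:

# Cumulative number of safepoints, time spent stopped, and time spent reaching safepoints since startup
jcmd <cassandra_pid> PerfCounter.print | grep safepoint

A disproportionately large sun.rt.safepointSyncTime means threads are taking a long time to reach the safepoint, which is exactly the pattern described below.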
If you turn on full GC logging in your cassandra-env.sh file, you will find lines like these:

2017-10-09T20:13:42.462+0000: 4.890: Total time for which application threads were stopped: 0.0003137 seconds, Stopping threads took: 0.0001163 seconds
2017-10-09T20:13:42.472+0000: 4.899: Total time for which application threads were stopped: 0.0001622 seconds, Stopping threads took: 0.0000361 seconds
2017-10-09T20:13:46.162+0000: 8.590: Total time for which application threads were stopped: 2.6899536 seconds, Stopping threads took: 2.6899004 seconds
2017-10-09T20:13:46.162+0000: 8.590: Total time for which application threads were stopped: 0.0002418 seconds, Stopping threads took: 0.0000456 seconds
2017-10-09T20:13:46.461+0000: 8.889: Total time for which application threads were stopped: 0.0002654 seconds, Stopping threads took: 0.0000397 seconds
2017-10-09T20:13:46.478+0000: 8.906: Total time for which application threads were stopped: 0.0001646 seconds, Stopping threads took: 0.0000791 seconds

These aren't GCs, but you can still see a 2.6s pause in the third line, with most of the time spent waiting for threads to reach the safepoint. When we saw this in the past, it was due to faulty disks that were preventing the read threads from reaching the safepoint.

If you want to identify exactly which threads got stuck, you can set a timeout on the safepoints:

# GC logging options
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=/var/log/cassandra/vm.log"
JVM_OPTS="$JVM_OPTS -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=5000"

Check the duration of the pauses you're seeing on your nodes (a quick way to scan the GC log for them is sketched below) and set a shorter timeout; reaching the safepoint should normally be fast. Above it is set to 5 seconds.

Restart your Cassandra process with the above settings and wait for one pause to happen. Then stop Cassandra: the safepoint information will be written to /var/log/cassandra/vm.log (this only happens when the process stops, nothing gets written there before that). If some threads were indeed preventing the safepoint, they will be listed there.
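To pick a sensible SafepointTimeoutDelay, you can pull the longest pauses out of your existing GC log. This is just a sketch: it assumes the exact log format shown above, and /var/log/cassandra/gc.log is a placeholder for wherever cassandra-env.sh sends your GC log:

# Print the 20 longest "application threads were stopped" pauses, longest first
awk '/Total time for which application threads were stopped/ {print $(NF-6), $0}' /var/log/cassandra/gc.log | sort -rn | head -20

The $(NF-6) field is simply where the pause duration sits in the sample lines above, so double-check it against your own log before trusting the output.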
Let us know how it goes.

Cheers,

On Fri, Oct 20, 2017 at 5:11 AM John Sanda <john.sa...@gmail.com> wrote:

> I have a small, two-node cluster running Cassandra 2.2.1. I am seeing a lot of these messages in both logs:
>
> WARN 07:23:16 Not marking nodes down due to local pause of 7219277694 > 5000000000
>
> I am fairly certain that they are not due to GC. I am not seeing a whole lot of GC being logged and nothing over 500 ms. I do think it is I/O related.
>
> I am seeing lots of read timeouts for queries to a table that has a large, growing number of SSTables. At last count there are over 1800 SSTables on one node. The count is lower on the other node, and I suspect that this is due to data distribution. Slowly but surely the number of SSTables keeps going up, and not surprisingly nodetool tablehistograms reports high latencies. The table is using STCS.
>
> I am seeing some but not a whole lot of dropped mutations. nodetool tpstats looks ok.
>
> The growing number of SSTables really makes me think this is an I/O issue. Cassandra is running in a kubernetes cluster using a SAN, which is another reason I suspect I/O.
>
> What are some things I can look at/test to determine what is causing all of the local pauses?
>
> - John

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com