Hello, @kurt greaves:

> Have you tried CMS with that sized heap?
Yes, for testing purposes I have 3 nodes with CMS and 3 with G1. The behavior is basically the same.

*Using CMS suggested settings*
http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMTAvOC8tLWdjLmxvZy4wLmN1cnJlbnQtLTE5LTAtNDk=

*Using G1 suggested settings*
http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMTAvOC8tLWdjLmxvZy4wLmN1cnJlbnQtLTE5LTExLTE3

@Steinmaurer, Thomas

> If this happens very frequently within a short time and, depending on your
> allocation rate in MB/s, a combination of the G1 bug and a small heap might
> result in going towards OOM.

We have a really high object allocation rate:

Avg creation rate: 622.9 MB/s
Avg promotion rate: 18.39 MB/s

It could be the cause, with the GC unable to keep up with this rate. I'm starting to think it could be some wrong configuration, where Cassandra is set up in a way that bursts allocations faster than G1 can keep up with. Any ideas? (I've put a rough back-of-envelope on these rates, and a sketch of what I plan to check in the GC log, below the quoted message.)

Best regards,

2017-10-09 12:44 GMT+01:00 Steinmaurer, Thomas <thomas.steinmau...@dynatrace.com>:

> Hi,
>
> Although not happening here with Cassandra (due to using CMS), we had some
> weird problems with our server application, e.g. being hit by the following
> JVM/G1 bugs:
>
> https://bugs.openjdk.java.net/browse/JDK-8140597
> https://bugs.openjdk.java.net/browse/JDK-8141402 (more or less a duplicate of the above)
> https://bugs.openjdk.java.net/browse/JDK-8048556
>
> Especially the first, JDK-8140597, might be interesting if you see periodic
> humongous allocations (according to a GC log) resulting in mixed GC phases
> being steadily interrupted due to the G1 bug, thus no GC in OLD regions.
> Humongous allocations will happen if a single (?) allocation is > (region
> size / 2), if I remember correctly. Can't recall the default G1 region size
> for a 12GB heap, but possibly 4MB. So, in case you are allocating something
> larger than 2MB, you might end up in something called "humongous"
> allocations, spanning several G1 regions. If this happens very frequently
> within a short time and, depending on your allocation rate in MB/s, a
> combination of the G1 bug and a small heap might result in going towards OOM.
>
> Possibly worth a further route for investigation.
>
> Regards,
> Thomas
>
> *From:* Gustavo Scudeler [mailto:scudel...@gmail.com]
> *Sent:* Monday, 09 October 2017 13:12
> *To:* user@cassandra.apache.org
> *Subject:* Cassandra and G1 Garbage collector stop the world event (STW)
>
> Hi guys,
>
> We have a 6 node Cassandra cluster under heavy utilization. We have been
> dealing a lot with garbage collector stop-the-world (STW) events, which can
> take up to 50 seconds on our nodes; in the meantime the Cassandra node is
> unresponsive, not even accepting new logins.
>
> Extra details:
>
> · Cassandra version: 3.11
> · Heap size: 12 GB
> · We are using the G1 garbage collector with default settings
> · Node size: 4 CPUs, 28 GB RAM
> · All CPU cores are at 100% all the time.
> · The G1 GC behavior is the same across all nodes.
>
> The behavior is basically:
>
> 1. Old gen starts to fill up.
> 2. GC can't clean it properly without a full GC and an STW event.
> 3. The full GCs start to take longer, until the node is completely unresponsive.
>
> *Extra details and GC reports:*
> https://stackoverflow.com/questions/46568777/cassandra-and-g1-garbage-collector-stop-the-world-event-stw
>
> Can someone point me to what configurations or events I could check?
>
> Thanks!
>
> Best regards,
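P.S. A rough back-of-envelope from the numbers above, just to get a feel for the scale. This ignores young/old gen sizing entirely, so treat it as an order-of-magnitude sketch only:

    heap size       = 12 GB = 12288 MB
    creation rate   = 622.9 MB/s  ->  12288 / 622.9 ≈ 20 s to allocate a full heap's worth of objects
    promotion rate  = 18.39 MB/s  ->  12288 / 18.39 ≈ 11 min to fill old gen if mixed GCs reclaim nothing

So if the mixed collections are effectively being skipped (e.g. because of the humongous-allocation bug Thomas mentioned), old gen filling up within minutes and ending in a full GC would be roughly the behavior we are seeing.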
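And this is roughly what I plan to check to confirm or rule out the humongous-allocation theory. It is only a sketch: the log file name is taken from the gceasy reports above (adjust the path to wherever -Xloggc points), the flags assume JDK 8, and the 16m region size is just an illustrative value we have not tested:

    # Look for humongous allocation activity in the current GC log
    grep -i humongous gc.log.0.current

    # Optional, in jvm.options: have G1 log why collections / concurrent cycles
    # were started, which includes humongous allocation requests (JDK 8 flag)
    -XX:+PrintAdaptiveSizePolicy

    # If humongous allocations show up frequently: with a 12 GB heap the default
    # G1 region size is 4 MB (heap / 2048, rounded to a power of two), so any single
    # allocation over ~2 MB is humongous. Raising the region size raises that
    # threshold; 16m here is only an example (must be a power of two, max 32m).
    -XX:G1HeapRegionSize=16m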