I have a Cassandra 3.0.9 cluster that is hitting OutOfMemoryErrors during direct byte buffer allocation. The stack trace looks like this:
java.lang.OutOfMemoryError: Direct buffer memory
	at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_131]
	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_131]
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_131]
	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:250) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:83) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.handler.ssl.SslHandler.allocate(SslHandler.java:1265) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.handler.ssl.SslHandler.allocateOutNetBuf(SslHandler.java:1275) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:453) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:432) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]

I do not yet have a heap dump.
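Since the failure is in java.nio.Bits.reserveMemory, the allocation goes through ByteBuffer.allocateDirect, so it should be counted by the JVM's standard "direct" buffer pool MXBean (java.nio:type=BufferPool,name=direct over JMX). Below is a minimal sketch of reading that bean. The class name DirectBufferProbe and its helper methods are hypothetical, and it inspects its own JVM only; I am assuming the same attributes could be polled from the Cassandra process through a JMX connection.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper, not part of Cassandra or Netty.
public class DirectBufferProbe {

    /** Returns the JVM's "direct" buffer pool bean, or null if none is exposed. */
    public static BufferPoolMXBean directPool() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    /** Formats the three attributes the bean exposes: Count, MemoryUsed, TotalCapacity. */
    public static String describe(BufferPoolMXBean pool) {
        return String.format(
                "direct buffers: count=%d used=%d bytes capacity=%d bytes",
                pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
    }

    public static void main(String[] args) {
        // Allocate one direct buffer so the pool has something to report.
        ByteBuffer probe = ByteBuffer.allocateDirect(1 << 20);
        BufferPoolMXBean pool = directPool();
        System.out.println(pool == null ? "no direct pool bean" : describe(pool));
        // Touch the buffer after the read so it stays reachable until here.
        System.out.println("probe capacity=" + probe.capacity());
    }
}
```

If MemoryUsed climbs steadily toward the direct memory limit before the errors start, that would at least be consistent with a buffer leak, though it would not by itself prove which ticket is responsible.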
The two relevant tickets are CASSANDRA-13114 <https://issues.apache.org/jira/browse/CASSANDRA-13114> and CASSANDRA-13126 <https://issues.apache.org/jira/browse/CASSANDRA-13126>. The upstream Netty ticket is 3057 <https://github.com/netty/netty/issues/3057>. Cassandra 3.0.11 upgraded Netty to the version with the fix.

Is there anything I can check to confirm that this is in fact the issue I am hitting?

Secondly, is there a way to monitor for this? The OOME does not cause the JVM to exit. Instead, the logs are getting filled up with OutOfMemoryErrors. nodetool status reports UN, and nodetool statusbinary reports running.

--
- John