Did you try any value in the range 8-20 GB (e.g. 60-70% of physical memory)? Also, how many tables do you have across all keyspaces? Each table can consume a minimum of roughly 1 MB of Java heap.
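On 2.1 the schema tables live in the system keyspace, so one quick way to count tables (a sketch; cqlsh -e should be available on 2.1, but adjust host and credentials for your setup) is:

    cqlsh -e "SELECT COUNT(*) FROM system.schema_columnfamilies;"

The heap itself is set via MAX_HEAP_SIZE (and HEAP_NEWSIZE) in conf/cassandra-env.sh.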
Best regards,
Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.

---- On Mon, 21 Nov 2016 05:13:12 -0500 Vincent Rischmann <m...@vrischmann.me> wrote ----

Hello,

we have an 8-node Cassandra 2.1.15 cluster at work which has been giving us a lot of trouble lately.

The problem is simple: nodes regularly die, either from an out-of-memory exception or because the Linux OOM killer decides to kill the process. For a couple of weeks now we have run with the heap increased to 20 GB, hoping it would stop the out-of-memory errors, but it didn't; instead of an out-of-memory exception, the OOM killer killed the JVM. We reduced the heap to 8 GB on some nodes to see if that would work better, but some of those nodes crashed again with an out-of-memory exception.

I suspect some of our tables are badly modelled, which would cause Cassandra to allocate a lot of data, but I don't know how to prove that and/or find which table is bad and which query is responsible. I tried looking at metrics in JMX, and tried profiling with Mission Control, but it didn't really help; it's possible I missed something because I have no idea what to look for exactly.

Does anyone have advice for troubleshooting this?

Thanks.
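For finding the tables holding the most memory, nodetool cfstats is a reasonable first step; a sketch (the label strings are from memory of the 2.1 output format, so verify against your own output):

    nodetool cfstats | grep -E 'Table:|Memtable data size|Compacted partition maximum bytes'

Large memtable data sizes show which tables dominate the heap between flushes, and a very large 'Compacted partition maximum bytes' flags wide partitions, which is a common modelling problem that can blow up the heap during reads or compaction.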