Did you try any value in the range 8-20 GB (e.g. 60-70% of physical memory)? Also, how many tables do you have across all keyspaces? Each table can consume a minimum of roughly 1 MB of Java heap.
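On 2.1 the schema tables live in the system keyspace, so one quick way to count tables (a sketch; cqlsh -e should be available on 2.1, but adjust host and credentials for your setup) is:

    cqlsh -e "SELECT COUNT(*) FROM system.schema_columnfamilies;"

The heap itself is set via MAX_HEAP_SIZE (and HEAP_NEWSIZE) in conf/cassandra-env.sh.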
Best regards,
Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.

---- On Mon, 21 Nov 2016 05:13:12 -0500 Vincent Rischmann <m...@vrischmann.me> wrote ----

Hello,

we have an 8-node Cassandra 2.1.15 cluster at work which has been giving us a lot of trouble lately.

The problem is simple: nodes regularly die, either from an out-of-memory exception or because the Linux OOM killer decides to kill the process. For a couple of weeks now we have run with the heap increased to 20 GB, hoping it would stop the out-of-memory errors, but it didn't; instead of an out-of-memory exception, the OOM killer killed the JVM. We reduced the heap to 8 GB on some nodes to see if that would work better, but some of those nodes crashed again with an out-of-memory exception.

I suspect some of our tables are badly modelled, which would cause Cassandra to allocate a lot of data, but I don't know how to prove that and/or find which table is bad and which query is responsible. I tried looking at metrics in JMX, and tried profiling with Mission Control, but it didn't really help; it's possible I missed something because I have no idea what to look for exactly.

Does anyone have advice for troubleshooting this?

Thanks.
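For finding the tables holding the most memory, nodetool cfstats is a reasonable first step; a sketch (the label strings are from memory of the 2.1 output format, so verify against your own output):

    nodetool cfstats | grep -E 'Table:|Memtable data size|Compacted partition maximum bytes'

Large memtable data sizes show which tables dominate the heap between flushes, and a very large 'Compacted partition maximum bytes' flags wide partitions, which is a common modelling problem that can blow up the heap during reads or compaction.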