Hi Vincent,

one of the usual causes of OOMs is very large partitions. Could you check your nodetool cfstats output for large partitions? If you find one (or more), run nodetool cfhistograms on those tables to get a view of the partition size distribution.
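For example, assuming a keyspace named ks and a table named events (placeholders, substitute your own names), something like this should surface the problem:

    nodetool cfstats ks.events
    # look at "Compacted partition maximum bytes" and
    # "Compacted partition mean bytes" in the output

    nodetool cfhistograms ks events
    # prints the partition size and cell count distributions as percentiles

Partitions reaching hundreds of megabytes or more at the top percentiles are a common source of heap pressure during reads and compaction.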
Thanks

On Mon, Nov 21, 2016 at 12:01 PM Vladimir Yudovin <[email protected]> wrote:

> Did you try any value in the range 8-20 GB (e.g. 60-70% of physical
> memory)? Also, how many tables do you have across all keyspaces? Each
> table can consume a minimum of 1 MB of Java heap.
>
> Best regards,
> Vladimir Yudovin
>
> Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra.
> Launch your cluster in minutes.
>
>
> ---- On Mon, 21 Nov 2016 05:13:12 -0500 Vincent Rischmann
> <[email protected]> wrote ----
>
> Hello,
>
> we have an 8-node Cassandra 2.1.15 cluster at work which has been
> giving us a lot of trouble lately.
>
> The problem is simple: nodes regularly die because of an out-of-memory
> exception, or because the Linux OOM killer decides to kill the process.
> A couple of weeks ago we increased the heap to 20 GB hoping it would
> solve the out-of-memory errors, but it didn't; instead of getting an
> out-of-memory exception, the OOM killer killed the JVM.
>
> We reduced the heap on some nodes to 8 GB to see if it would work
> better, but some nodes crashed again with an out-of-memory exception.
>
> I suspect some of our tables are badly modelled, which would cause
> Cassandra to allocate a lot of memory, but I don't know how to prove
> that and/or find which table is bad and which query is responsible.
>
> I tried looking at metrics in JMX, and tried profiling with Mission
> Control, but it didn't really help; it's possible I missed something
> because I have no idea what to look for exactly.
>
> Does anyone have advice for troubleshooting this?
>
> Thanks.

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
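Regarding the 8-20 GB range suggested above: on Cassandra 2.1 the heap is configured in conf/cassandra-env.sh. A minimal sketch, with illustrative values rather than a recommendation:

    # conf/cassandra-env.sh
    MAX_HEAP_SIZE="8G"    # total JVM heap; leave room for off-heap structures
                          # and the OS page cache
    HEAP_NEWSIZE="800M"   # young generation; the file's own guidance is
                          # roughly 100 MB per physical CPU core

    # verify what a node is actually running with:
    nodetool info | grep Heap

Note that when the Linux OOM killer kills the process (as opposed to the JVM throwing an OutOfMemoryError), total resident memory (heap plus off-heap) exceeded physical RAM, so reducing the heap, as was tried here, is a reasonable direction.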
