Re: OOM on Apache Cassandra on 30 Plus node at the same time

Priyanka Sat, 04 Mar 2017 10:22:45 -0800


Sent from my iPhone


> On Mar 3, 2017, at 12:18 PM, Shravan Ch <chall...@outlook.com> wrote:
> 
> Hello,
> 
> More than 30 Cassandra servers in the primary DC went down with the OOM 
> exception below. What puzzles me is the scale at which it happened (all 
> within the same minute). I will share some more details below. 
> 
> System Log: http://pastebin.com/iPeYrWVR
> GC Log: http://pastebin.com/CzNNGs0r
> 
> During the OOM I saw a lot of WARNings like the one below (these had been 
> there for quite some time, maybe weeks):
> WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 
> - Batch of prepared statements for [keyspace.table] is of size 225455, 
> exceeding specified threshold of 65536 by 159919.
> 
> Environment:
> We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC 
> (more C* nodes, on SSD; the apps run here) and a secondary DC (geographically 
> remote, more like a DR site for the primary) on SAS drives. 
> Cassandra config:
> 
> Java 1.8.0_65
> Garbage Collector: G1GC
> memtable_allocation_type: offheap_objects
> 
> Since this OOM I am seeing a huge pile-up of hints on the majority of the 
> nodes, and the pending hints keep going up. I increased HintedHandoff 
> CoreThreads to 6, but that did not help (I admit I only tried this on one 
> node).
> 
> nodetool compactionstats -H
> pending tasks: 3
> compaction type   keyspace   table   completed      total    unit   progress
>      Compaction     system   hints     28.5 GB   92.38 GB   bytes     30.85%
> 
> 
> Appreciate your inputs here. 
> 
> Thanks,
> Shravan
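
Since the oversized-batch WARNings quoted above had apparently been showing up for weeks, it may help to see which tables they come from and how large the batches get. Below is a minimal Python sketch for that, not a definitive tool. The regex is modeled only on the single WARN line quoted above (other Cassandra versions may word it differently), and the names BATCH_WARN and summarize_batch_warnings are made up for illustration.

```python
import re
from collections import defaultdict

# Pattern modeled on the one WARN line quoted in this thread (assumption:
# the real log has the message on a single line, without email wrapping).
BATCH_WARN = re.compile(
    r"Batch of prepared statements for \[(?P<tables>[^\]]+)\] is of size (?P<size>\d+), "
    r"exceeding specified threshold of (?P<threshold>\d+) by (?P<excess>\d+)"
)


def summarize_batch_warnings(lines):
    """Return {table: (warning_count, largest_batch_size_in_bytes)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        match = BATCH_WARN.search(line)
        if match is None:
            continue  # not an oversized-batch warning
        size = int(match.group("size"))
        # The bracketed part may list several tables separated by commas.
        for table in match.group("tables").split(","):
            entry = stats[table.strip()]
            entry[0] += 1
            entry[1] = max(entry[1], size)
    return {table: (count, largest) for table, (count, largest) in stats.items()}


# The warning from this thread, joined back into one line.
sample = (
    "WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252 "
    "- Batch of prepared statements for [keyspace.table] is of size 225455, "
    "exceeding specified threshold of 65536 by 159919."
)

print(summarize_batch_warnings([sample]))
# prints {'keyspace.table': (1, 225455)}
```

To run it over a real log, pass an open file handle instead of the sample list, for example summarize_batch_warnings(open(path)) where path points at a node's system.log. In the quoted warning the batch is 225455 bytes against a 65536-byte (64 KB) threshold, so roughly 3.4 times the configured limit.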
