[ https://issues.apache.org/jira/browse/CASSANDRA-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-6346.
---------------------------------------

    Resolution: Not A Problem

The main knob to turn to make load shedding more aggressive is to reduce 
rpc_write_timeout.  (See CASSANDRA-6059)
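
A minimal sketch of that change in cassandra.yaml, assuming the timeout in
question maps to write_request_timeout_in_ms (the setting name and value here
are illustrative assumptions, not part of this ticket):

    # How long the coordinator waits for a write before timing out; lowering
    # this makes backed-up mutations get shed sooner instead of piling up on
    # the heap.
    write_request_timeout_in_ms: 1000

    # Sizes the mutation-stage worker pool; it bounds threads, not the pending
    # queue that was consuming the heap in this report.
    concurrent_writes: 32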

> Cassandra 2.0 server node runs out of memory during writes/replications
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-6346
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6346
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nitin
>         Attachments: LinkedBlockingQ.png
>
>
> Currently we are running an 18-node Cassandra cluster with the 
> NetworkTopologyStrategy replication strategy (d1 = 3 and d2 = 3).
> Our servers seem to crash with OOM exceptions. Our heap size is 8 GB. However, 
> while a node was crashing I got hold of the hprof file and ran it through the 
> Eclipse MAT analyzer.
> After analyzing the hprof (please see the attachment for the top offenders), 
> I find that there is a LinkedBlockingQueue (from the mutation stage) that 
> seems to hold about 7.3 GB of the total 8 GB of RAM.
> After deep-diving into the Cassandra 2.0 code, I see that every 
> update/write/replication goes through stages, including the mutation stage, 
> and the number of threads that service this queue (I am assuming the 
> memtable-to-sstable write) is controlled by concurrent_writes. Ours is set 
> to 32.
> However, we observe node crashes even when there are 0 writes to the node 
> but replication requests are floating around the cluster.
> Any ideas what knobs throttle the size of these queues / the maximum number 
> of write and replication requests a node can accept? What are the 
> recommended settings to operate a Cassandra node in a mode where it rejects 
> requests beyond a certain queue threshold?
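
For anyone tracking this down, the depth of that mutation-stage queue can be
watched without a heap dump; a minimal check, assuming nodetool from the same
2.0 install is on the PATH:

    # Per-stage thread pool statistics; the MutationStage row shows active
    # threads (bounded by concurrent_writes) and pending tasks (the queue
    # described above).
    nodetool tpstats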



--
This message was sent by Atlassian JIRA
(v6.1#6144)
