What are your bloom filter settings on your CFs? Maybe look here: 
http://www.datastax.com/docs/1.1/operations/tuning#tuning-bloomfilters
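
You can see roughly how much heap the bloom filters take per CF from "nodetool cfstats" (the "Bloom Filter Space Used" lines); in 1.1 the filters live on-heap, so they grow with the number of rows.

As for your question about monitoring the heap: "nodetool info" prints the node's current heap used/total, and to chart it over time you can poll the standard JVM memory MBean over JMX (Cassandra listens on 7199 by default). A minimal sketch in Java, assuming remote JMX is reachable without authentication (HeapProbe and the host argument are just illustrative):

    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryUsage;
    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class HeapProbe {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            // 7199 is Cassandra's default JMX port (set in cassandra-env.sh).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = jmxc.getMBeanServerConnection();
                // Standard JVM memory MBean, nothing Cassandra-specific.
                MemoryMXBean mem = JMX.newMXBeanProxy(
                        conn, new ObjectName("java.lang:type=Memory"), MemoryMXBean.class);
                MemoryUsage heap = mem.getHeapMemoryUsage();
                System.out.printf("heap used %d MB / committed %d MB / max %d MB%n",
                        heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
            }
        }
    }

Running "jstat -gcutil <pid> 1s" on the node gives the same picture broken down by GC generation, which helps tell steady memtable pressure from old-gen growth.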



On Nov 7, 2012, at 4:56 AM, Alain RODRIGUEZ wrote:

> Hi,
> 
> We just had an issue in production that we finally solved by upgrading our 
> hardware and increasing the heap.
> 
> Now we have 3 xLarge servers from AWS (15G RAM, 4 CPUs / 8 cores). We added 
> them and then removed the old ones.
> 
> With the full default configuration, the 0.75 flush threshold of the 4G heap 
> was being reached continuously, so I had to increase the heap to 8G:
> 
> Memtable  : 2G   (manually configured)
> Key cache : 0.1G (min(5% of heap in MB, 100MB))
> System    : 1G   (more or less, from the Datastax docs)
> 
> That adds up to about 3G, but the heap actually uses between 4 and 6G.
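>
> For reference, the manual memtable cap is the memtable_total_space_in_mb 
> setting in cassandra.yaml, the same knob the Datastax formula below uses. 
> Assuming it was set to match the 2G above, that would be:
>
>     memtable_total_space_in_mb: 2048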
> 
> So here are my questions:
> 
> How can we see how the heap is being used, and how can we monitor it?
> Why is so much heap memory being used on my new servers?
> 
> All settings not specified are the Cassandra 1.1.2 defaults.
> 
> Here is what happened to us before and why we changed our hardware. If you 
> have any clue about what happened, we would be glad to learn and maybe go 
> back to our old hardware.
> 
> -------------------------------- User experience --------------------------------
> 
> We had a 2-node Cassandra 1.1.2 cluster with RF 2 and CL.ONE (reads and 
> writes) running on 2 AWS m1.large instances (7.5G RAM, 2 CPUs / 4 cores, 
> dedicated to Cassandra only).
> 
> cassandra.yaml was left with the 1.1.2 defaults, and in cassandra-env.sh I 
> configured a 4G heap with a 200M new-generation size.
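>
> That is, assuming the standard cassandra-env.sh variables were used:
>
>     MAX_HEAP_SIZE="4G"
>     HEAP_NEWSIZE="200M"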
> 
> Here is how the heap was supposed to be used:
> 
> Memtable  : 1.4G (1/3 of the heap)
> Key cache : 0.1G (min(5% of heap in MB, 100MB))
> System    : 1G   (more or less, from the Datastax docs)
> 
> So in theory we are at about 2.5G max out of 3G usable (0.75 of the heap 
> being the threshold at which memtables are flushed under pressure).
> 
> I thought this was OK according to the Datastax documentation:
> 
> "Regardless of how much RAM your hardware has, you should keep the JVM heap 
> size constrained by the following formula and allow the operating system’s 
> file cache to do the rest:
> (memtable_total_space_in_mb) + 1GB + (cache_size_estimate)"
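>
> Working that formula through for our 4G heap (memtables get 1/3 of the heap, 
> key cache capped at 100MB):
>
>     memtable_total_space_in_mb ≈ 4096 / 3 ≈ 1365 MB
>     1365 MB + 1024 MB + ~100 MB ≈ 2.4 GB
>
> which is roughly the 2.5G figure above.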
> 
> After adding a third node and changing the RF from 2 to 3 (to allow using 
> CL.QUORUM and still be able to restart a node whenever we want), things went 
> really bad, though I still don't understand how any of these operations could 
> have affected the heap requirements.
> 
> All three nodes reached the 0.75 heap threshold (I tried increasing it to 
> 0.85, but that was reached too), and heap usage never came down. So my 
> cluster started flushing constantly, and the load increased because of 
> unceasing compactions. This unexpected load produced latency that broke our 
> service for a while. Even with the service down, Cassandra was unable to 
> recover.
> 
