The place to start is with the statistics Cassandra logs after each GC.

On Tue, May 31, 2011 at 5:01 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
> hi everyone,
>
> the current nodes i have deployed (4) have all been working fine, with
> not a lot of data ... more reads than writes at the moment. as i had
> monitoring disabled, when one node's OS killed the cassandra process
> due to out of memory problems ... that was fine. 24 hours later,
> another node, 24 hours later, another node ...until finally, all 4
> nodes no longer had cassandra running.
>
> When all nodes are started fresh, CPU utilization is at about 21% on
> each box. after 24 hours, this goes up to 32% and then 51% 24 hours
> later.
>
> originally I had thought that this may be a result of 'nodetool
> repair' not being run consistently ... after adding a cronjob to run
> every 24 hours (staggered between nodes) the problem of the increasing
> memory utilization does not resolve.
>
> i've read the operations page and also the
> http://wiki.apache.org/cassandra/MemtableThresholds page. i am
> running defaults and 0.7.6-02 ...
>
> what are the best places to start in terms of finding why this is
> happening? CF design / usage? 'nodetool cfstats' gives me some good
> info ... and i've already implemented some changes to one CF based on
> how it had ballooned (too many rows versus not enough columns)
>
> suggestions appreciated
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
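
A minimal sketch of the suggestion to start with the per-GC statistics, assuming they appear in the Cassandra log as lines of the form `GC for <collector>: <N> ms, <N> reclaimed leaving <N> used; max is <N>`. That line shape, the sample lines, and the helper names `parse_gc_lines` and `heap_is_growing` are assumptions for illustration and are not taken from this thread; adjust the pattern to whatever your version actually logs. The idea is to watch how much heap is still in use after each collection: if that figure keeps climbing, the heap is filling with live data rather than garbage.

```python
import re

# ASSUMPTION: this pattern reflects a guessed log line shape, not one quoted
# in the thread. Check it against a real line from your own system.log.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms, "
    r"(?P<reclaimed>\d+) reclaimed leaving (?P<used>\d+) used; "
    r"max is (?P<max>\d+)"
)


def parse_gc_lines(lines):
    """Yield (collector, pause_ms, used_fraction) for each line that matches."""
    for line in lines:
        m = GC_LINE.search(line)
        if m:
            used = int(m.group("used"))
            heap_max = int(m.group("max"))
            yield m.group("collector"), int(m.group("ms")), used / heap_max


def heap_is_growing(fractions, min_samples=3):
    """Return True if post-GC heap usage never decreases across the samples."""
    fractions = list(fractions)
    if len(fractions) < min_samples:
        return False
    return all(b >= a for a, b in zip(fractions, fractions[1:]))


# Made-up sample lines (hypothetical values, for illustration only).
sample = [
    "GC for ConcurrentMarkSweep: 500 ms, 1000 reclaimed leaving 4000 used; max is 10000",
    "some unrelated log line",
    "GC for ConcurrentMarkSweep: 700 ms, 500 reclaimed leaving 6000 used; max is 10000",
    "GC for ConcurrentMarkSweep: 900 ms, 200 reclaimed leaving 8000 used; max is 10000",
]

for collector, pause_ms, used_fraction in parse_gc_lines(sample):
    print(f"{collector}: {pause_ms} ms pause, {used_fraction:.0%} of heap still used")
```

To try it on a real node, pass the open log file instead of `sample`, for example `parse_gc_lines(open(path))`, where `path` is wherever your install writes its log.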