Thanks for the suggestions, but I'd already removed the compression by the time your message came through. That alleviated the problem but didn't solve it. I'm still looking at a few other possible causes and will post back if I work out what's going on. For now I'm running rolling repairs to avoid another outage.

On Sun, Mar 11, 2012 at 6:32 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> One thing you may want to look at is the meanRowSize from nodetool
> cfstats and your compression block size. In our case the mean
> compacted size is 560 bytes and 64KB block size caused CPU tickets and
> a lot of short lived memory. I have brought my block size down to 16K.
> The result tables are not noticeably larger and less memory pressure
> on the young gen. I might try going down to 4 K next.
>
> On Sat, Mar 10, 2012 at 5:38 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > The only downside of compression is it does cause more memory
> > pressure. I can imagine something like repair could confound this.
> > Since it would seem like building the merkle tree would involve
> > decompressing every block on disk.
> >
> > I have been attempting to determine if the block size being larger or
> > smaller has any effect on memory pressure.
> >
> > On Sat, Mar 10, 2012 at 4:50 PM, Peter Schuller
> > <peter.schul...@infidyne.com> wrote:
> >>> However, when I run a repair my CMS usage graph no longer shows sudden drops
> >>> but rather gradual slopes and only manages to clear around 300MB each GC.
> >>> This seems to occur on 2 other nodes in my cluster around the same time, I
> >>> assume this is because they're the replicas (we use 3 replicas). Parnew
> >>> collections look about the same on my graphs with or without repair running
> >>> so no trouble there so far as I can tell.
> >>
> >> I don't know why leveled/snappy would affect it, but disregarding
> >> that, I would have been suggesting that you are seeing additional heap
> >> usage because of long-running repairs retaining sstables and delaying
> >> their unload/removal (index sampling/bloom filters filling your heap).
> >> If it really only happens for leveled/snappy however, I don't know
> >> what that might be caused by.
> >>
> >> --
> >> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
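
As a rough illustration of the block-size point quoted above, here is a back-of-the-envelope sketch in Python. It assumes that reading a single row means decompressing one whole compression block and that the row does not straddle a block boundary. Both are simplifying assumptions, not measurements from this cluster. The 560-byte mean row size and the 64K/16K/4K block sizes are the figures from the thread; the function name is made up for the example.

```python
# Back-of-the-envelope only: how many bytes get decompressed per byte of
# row data actually wanted, for the block sizes mentioned in the thread.
# Assumption: one row read decompresses exactly one full block.

MEAN_ROW_BYTES = 560  # mean compacted row size quoted in the thread


def read_amplification(block_kb: int, row_bytes: int = MEAN_ROW_BYTES) -> float:
    """Ratio of bytes decompressed to bytes of row data wanted (hypothetical helper)."""
    block_bytes = block_kb * 1024
    return block_bytes / row_bytes


if __name__ == "__main__":
    for block_kb in (64, 16, 4):
        ratio = read_amplification(block_kb)
        print(f"{block_kb:>2} KB block: roughly {ratio:.0f}x the row size decompressed per read")
```

Under those assumptions, a smaller block means far less short-lived decompressed data per row read, which would be consistent with the reduced young-gen pressure described above. It says nothing about the compression ratio, which has to be checked separately.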