What's the output of nodetool cfstats for those 2 column families on cassNode2 and cassNode3? And what is the replication factor for this cluster?
Per the previous reply, nodetool ring should show each of your nodes with ~16.7% of the data if the cluster is well balanced.

Also, the auto-detection for memory sizes in the startup script is a little off w/r/t m1.xlarge because of the 'slightly less than 16 GB' of RAM. It usually ends up allocating 4g/400m (max heap and young generation), whereas 8g/800m will give you some more breathing room.

On Wed, Jan 30, 2013 at 12:07 PM, Bryan Talbot <btal...@aeriagames.com> wrote:
> My guess is that those one or two nodes with the gc pressure also have more
> rows in your big CF. More rows could be due to imbalanced distribution if
> you're not using a random partitioner or from those nodes not yet removing
> deleted rows which other nodes may have done.
>
> JVM heap space is used for a few things which scale with key count
> including:
> - bloom filter (for C* < 1.2)
> - index samples
>
> Other space is used but can be more easily controlled by tuning for
> - memtable
> - compaction
> - key cache
> - row cache
>
> So, if those nodes have more rows (check using "nodetool ring" or "nodetool
> cfstats") than the others you can try to:
> - reduce the number of rows by adding nodes, run manual / tune compactions
>   to remove rows with expired tombstones, etc.
> - increase bloom filter fp chance
> - increase jvm heap size (don't go too big)
> - disable key or row cache
> - increase index sample interval
>
> Not all of those things are generally good, especially taken to the extreme,
> so don't go setting a 20 GB jvm heap without understanding the consequences,
> for example.
>
> -Bryan
>
>
> On Wed, Jan 30, 2013 at 3:47 AM, Guillermo Barbero
> <guillermo.barb...@spotbros.com> wrote:
>>
>> Hi,
>>
>> I'm seeing weird behaviour in my cassandra cluster. Most of the
>> warning messages are "Heap is % full". According to this link
>>
>> (http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassndra-1-0-6-GC-query-tt7323457.html)
>> there are two ways to "reduce pressure":
>> 1. Decrease the cache sizes
>> 2. Increase the index interval size
>>
>> Most of the flushes are in two column families (users and messages), I
>> guess that's because most of the mutations are there.
>>
>> I still have not applied those changes to the production environment.
>> Do you recommend any other measure? Should I set specific tuning for
>> these two CFs? Should I check another metric?
>>
>> Additionally, the distribution of warning messages is not uniform
>> across the cluster. Why could cassandra be doing this? What should I do
>> to find out how to fix this?
>>
>> cassandra runs on a 6 node cluster of m1.xlarge machines (Amazon EC2)
>> the java version is the following:
>> java version "1.6.0_37"
>> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
>> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>>
>> The cassandra system.log is summarized here (number of messages, cassandra
>> node, class that reports the message, first word of the message)
>>
>> 2013-01-26
>>   5 cassNode0: GCInspector.java Heap
>>   5 cassNode0: StorageService.java Flushing
>> 232 cassNode2: GCInspector.java Heap
>> 232 cassNode2: StorageService.java Flushing
>> 104 cassNode3: GCInspector.java Heap
>> 104 cassNode3: StorageService.java Flushing
>>   3 cassNode4: GCInspector.java Heap
>>   3 cassNode4: StorageService.java Flushing
>>   3 cassNode5: GCInspector.java Heap
>>   3 cassNode5: StorageService.java Flushing
>>
>> 2013-01-27
>>   2 cassNode0: GCInspector.java Heap
>>   2 cassNode0: StorageService.java Flushing
>>   3 cassNode1: GCInspector.java Heap
>>   3 cassNode1: StorageService.java Flushing
>> 189 cassNode2: GCInspector.java Heap
>> 189 cassNode2: StorageService.java Flushing
>> 104 cassNode3: GCInspector.java Heap
>> 104 cassNode3: StorageService.java Flushing
>>   1 cassNode4: GCInspector.java Heap
>>   1 cassNode4: StorageService.java Flushing
>>   1 cassNode5: GCInspector.java Heap
>>   1 cassNode5: StorageService.java Flushing
>>
>> 2013-01-28
>>   2 cassNode0: GCInspector.java Heap
>>   2 cassNode0: StorageService.java Flushing
>>   1 cassNode1: GCInspector.java Heap
>>   1 cassNode1: StorageService.java Flushing
>>   1 cassNode2: AutoSavingCache.java Reducing
>> 343 cassNode2: GCInspector.java Heap
>> 342 cassNode2: StorageService.java Flushing
>> 181 cassNode3: GCInspector.java Heap
>> 181 cassNode3: StorageService.java Flushing
>>   4 cassNode4: GCInspector.java Heap
>>   4 cassNode4: StorageService.java Flushing
>>   3 cassNode5: GCInspector.java Heap
>>   3 cassNode5: StorageService.java Flushing
>>
>> 2013-01-29
>>   2 cassNode0: GCInspector.java Heap
>>   2 cassNode0: StorageService.java Flushing
>>   3 cassNode1: GCInspector.java Heap
>>   3 cassNode1: StorageService.java Flushing
>> 156 cassNode2: GCInspector.java Heap
>> 156 cassNode2: StorageService.java Flushing
>>  71 cassNode3: GCInspector.java Heap
>>  71 cassNode3: StorageService.java Flushing
>>   2 cassNode4: GCInspector.java Heap
>>   2 cassNode4: StorageService.java Flushing
>>   2 cassNode5: GCInspector.java Heap
>>   1 cassNode5: Memtable.java setting
>>   2 cassNode5: StorageService.java Flushing
>>
>> --
>>
>> Guillermo Barbero - Backend Team
>>
>> Spotbros Technologies
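
To put a number on how uneven the warnings in the quoted log summary are, here is a minimal Python sketch that totals the GCInspector "Heap" counts per node over the four days. The counts are copied from the summary; the names GC_WARNINGS and warning_share are made up for this illustration and are not part of Cassandra or nodetool.

```python
from collections import Counter

# GCInspector "Heap" warning counts per day and node, copied from the
# log summary quoted in this thread (cassNode1 has no entry on 2013-01-26).
GC_WARNINGS = {
    "2013-01-26": {"cassNode0": 5, "cassNode2": 232, "cassNode3": 104,
                   "cassNode4": 3, "cassNode5": 3},
    "2013-01-27": {"cassNode0": 2, "cassNode1": 3, "cassNode2": 189,
                   "cassNode3": 104, "cassNode4": 1, "cassNode5": 1},
    "2013-01-28": {"cassNode0": 2, "cassNode1": 1, "cassNode2": 343,
                   "cassNode3": 181, "cassNode4": 4, "cassNode5": 3},
    "2013-01-29": {"cassNode0": 2, "cassNode1": 3, "cassNode2": 156,
                   "cassNode3": 71, "cassNode4": 2, "cassNode5": 2},
}


def warning_share(per_day):
    """Return {node: (total_warnings, percent_of_all_warnings)}."""
    totals = Counter()
    for counts in per_day.values():
        totals.update(counts)  # adds each day's counts to the running totals
    grand_total = sum(totals.values())
    return {node: (n, 100.0 * n / grand_total) for node, n in totals.items()}


if __name__ == "__main__":
    for node, (count, pct) in sorted(warning_share(GC_WARNINGS).items()):
        print(f"{node}: {count:4d} warnings ({pct:4.1f}%)")
```

With these counts, cassNode2 and cassNode3 report 920 and 460 of the 1417 warnings, about 97% between them. If the warnings were spread evenly over six nodes, each would report roughly 16.7%. That is why the cfstats output for those two nodes is the first thing to compare.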