We are facing a similar issue and have not been able to keep the ring stable. We are running C* 1.2.3 on CentOS 6, with 6 nodes, 32 GB of RAM and an 8 GB heap each. The total data is ~84 GB with an RF of 3 (which is relatively small for C* to handle). Our application is read-heavy, and we see the GC complaints on all nodes; I copied and pasted the output below. We also usually see much larger values for Pending on ReadStage than the ones shown here, and I am not sure what the best advice for that is.

Thanks,
Haithem
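The ReadStage backlog can also be watched outside of these periodic StatusLogger dumps: nodetool tpstats prints the same pool table on demand. A minimal watch loop (just a sketch; it assumes nodetool is on the PATH and defaults to the local node):

    # poll the thread-pool stats every 10s, keeping the header and the ReadStage row
    while true; do
      nodetool tpstats | grep -E 'Pool Name|ReadStage'
      sleep 10
    done

If Pending there grows steadily instead of spiking and draining, the read path is genuinely saturated rather than just stalled by the occasional GC pause.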
INFO [ScheduledTasks:1] 2013-04-23 16:40:02,118 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 911 ms for 1 collections, 5945542968 used; max is 8199471104
INFO [ScheduledTasks:1] 2013-04-23 16:40:16,051 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 322 ms for 1 collections, 5639896576 used; max is 8199471104
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,829 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 2273 ms for 1 collections, 6762618136 used; max is 8199471104
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 53) Pool Name                    Active   Pending   Blocked
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 68) ReadStage                         4         4         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) RequestResponseStage              1         6         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReadRepairStage                   0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) MutationStage                     0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReplicateOnWriteStage             0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) GossipStage                       0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) AntiEntropyStage                  0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MigrationStage                    0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MemtablePostFlusher               0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) FlushWriter                       0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) MiscStage                         0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) commitlog_archiver                0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) InternalResponseStage             0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) AntiEntropySessions               0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) HintedHandoff                     0         0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,843 StatusLogger.java (line 73) CompactionManager                 0         0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 85) MessagingService                n/a      15,1
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 95) Cache Type     Size        Capacity    KeysToSave   Provider
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 96) KeyCache       251658064   251658081   all
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 102) RowCache      0           0           all          org.apache.cassandra.cache.SerializingCacheProvider
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 109) ColumnFamily                Memtable ops,data
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.local                              0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.peers                              0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.batchlog                           0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.NodeIdInfo                         0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.LocationInfo                       0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Schema                             0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Migrations                         0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_keyspaces                   0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columns                     0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columnfamilies              0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.IndexInfo                          0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.range_xfers                        0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.peer_events                        0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.hints                              0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.HintsColumnFamily                  0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo                                     0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo2                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo3                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo4                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo5                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo6                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo7                                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_auth.users                         0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.sessions                    0,0
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.events                      0,0
WARN [ScheduledTasks:1] 2013-04-23 16:40:30,850 GCInspector.java (line 142) Heap is 0.824762725573964 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-04-23 16:40:30,850 StorageService.java (line 3537) Unable to reduce heap usage since there are no dirty column families
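For reference, the flush_largest_memtables_at threshold named in the WARN above is one of three heap "emergency valves" in cassandra.yaml. A quick way to check them (a sketch; the config path below is the package layout on CentOS and varies by install, and the values shown are the shipped 1.1/1.2 defaults):

    # list the heap emergency-valve settings and their current values
    grep -nE 'flush_largest_memtables_at|reduce_cache_sizes_at|reduce_cache_capacity_to' \
        /etc/cassandra/conf/cassandra.yaml
    # shipped defaults:
    #   flush_largest_memtables_at: 0.75
    #   reduce_cache_sizes_at: 0.85
    #   reduce_cache_capacity_to: 0.6

Raising the threshold only silences the valve; a heap that still sits above 80% after a full CMS collection, as in the log above, needs its live set reduced rather than the warning tuned away.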
On 23 April 2013 16:52, Ralph Goers <ralph.go...@dslextreme.com> wrote:

> We are using DSE, which I believe is also 1.1.9. We have basically had a
> non-usable cluster for months due to this error. In our case, once it
> starts doing this, it starts flushing sstables to disk and eventually fills
> up the disk to the point where it can't compact. If we catch it soon
> enough and restart the node, it usually can recover.
>
> In our case, the heap size is 12 GB. As I understand it, Cassandra will
> give 1/3 of that to sstables. I then noticed that we have one column
> family that is using nearly 4 GB in bloom filters on each node. Since the
> nodes will start doing this when the heap reaches 9 GB, we essentially
> only have 1 GB of free memory, so when compactions, cleanups, etc. take
> place, this situation starts happening. We are working to change our data
> model to try to resolve this.
>
> Ralph
>
> On Apr 19, 2013, at 8:00 AM, Michael Theroux wrote:
>
> > Hello,
> >
> > We've recently upgraded from m1.large to m1.xlarge instances on AWS to
> > handle additional load, but also to relieve memory pressure. It appears
> > to have accomplished both; however, we are still getting a warning, 0-3
> > times a day, on our database nodes:
> >
> > WARN [ScheduledTasks:1] 2013-04-19 14:17:46,532 GCInspector.java (line
> > 145) Heap is 0.7529240824406468 full. You may need to reduce memtable
> > and/or cache sizes. Cassandra will now flush up to the two largest
> > memtables to free up memory. Adjust flush_largest_memtables_at threshold
> > in cassandra.yaml if you don't want Cassandra to do this automatically
> >
> > This is happening much less frequently than before the upgrade, but
> > after essentially doubling the amount of available memory, I'm curious
> > about what I can do to determine what is happening during this time.
> >
> > I am collecting all the JMX statistics. Memtable space is elevated but
> > not extraordinarily high. No GC messages are being output to the log.
> >
> > These warnings do seem to be occurring during compactions of column
> > families using LCS with wide rows, but I'm not sure there is a direct
> > correlation.
> >
> > We are running Cassandra 1.1.9, with a maximum heap of 8G.
> >
> > Any advice?
> > Thanks,
> > -Mike
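On the bloom-filter pressure Ralph describes above: the per-column-family false-positive chance can be loosened to shrink the filters, trading heap for some extra disk reads. A sketch with placeholder keyspace/CF names (CQL3 on 1.2; on 1.1 the same bloom_filter_fp_chance attribute can be set from cassandra-cli instead):

    # raise the false-positive chance so the filters shrink (names are hypothetical)
    echo "ALTER TABLE myks.wide_cf WITH bloom_filter_fp_chance = 0.1;" | cqlsh
    # filters are only rebuilt when sstables are rewritten, so force a rewrite:
    nodetool scrub myks wide_cf

Until the sstables are rewritten, the old filters stay resident, so the heap relief is gradual unless the rewrite is forced as above.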