Hello,

Just to wrap up my part of this thread: tuning the CMS initiating occupancy threshold (-XX:CMSInitiatingOccupancyFraction) to 70 appears to have resolved my issues with the memory warnings. However, I don't believe this would be a solution to all the issues mentioned below. It does make sense to me to tune this value below the "flush_largest_memtables_at" value in cassandra.yaml, so that a CMS collection will kick in before we start flushing memtables to free memory.
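As a rough sketch of that ordering, the snippet below compares the two thresholds and recomputes the heap fraction from one of the GCInspector lines quoted further down. It is in Python purely for illustration, and the function names are hypothetical. The 0.75 value for flush_largest_memtables_at is an assumption (check your own cassandra.yaml). The comparison is also a simplification: the JVM flag is a percentage of the old generation, while the Cassandra setting is a fraction of the whole heap.

```python
# Hypothetical helper names; only the numeric values come from this thread.

def heap_fraction(used_bytes: int, max_bytes: int) -> float:
    """Fraction of the heap in use, from GCInspector's 'used' and 'max' values."""
    return used_bytes / max_bytes


def cms_starts_before_flush(cms_occupancy_percent: int, flush_at: float) -> bool:
    """True if the CMS trigger sits below the emergency-flush threshold.

    Simplification: the JVM flag applies to old-generation occupancy and the
    Cassandra setting to the whole heap, so this is only an approximation.
    """
    return cms_occupancy_percent / 100.0 < flush_at


FLUSH_AT = 0.75  # assumed flush_largest_memtables_at; check cassandra.yaml

# 70 is the value that worked in my case; 75 would leave no headroom.
print(cms_starts_before_flush(70, FLUSH_AT))
print(cms_starts_before_flush(75, FLUSH_AT))

# From the quoted log: "6762618136 used; max is 8199471104"
print(heap_fraction(6762618136, 8199471104) > FLUSH_AT)
```

With these inputs the first and third checks come out true and the second false, which is consistent with the "Heap is 0.824762725573964 full" warning in the quoted log.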
Thanks!
-Mike

On Apr 23, 2013, at 12:47 PM, Haithem Jarraya wrote:

> We are facing similar issue, and we are not able to have the ring stable. We
> are using C*1.2.3 on Centos6, 32GB - RAM, 8GB-heap, 6 Nodes.
> The total data ~ 84gb (which is relatively small for C* to handle, with a RF
> of 3). Our application is heavy read, we see the GC complaints in all nodes,
> I copied and past the output below.
> Also we usually see much larger values for the Pending - ReadStage, not sure
> what is the best advice for this.
>
> Thanks,
>
> Haithem
>
> INFO [ScheduledTasks:1] 2013-04-23 16:40:02,118 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 911 ms for 1 collections, 5945542968 used; max is 8199471104
> INFO [ScheduledTasks:1] 2013-04-23 16:40:16,051 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 322 ms for 1 collections, 5639896576 used; max is 8199471104
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,829 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 2273 ms for 1 collections, 6762618136 used; max is 8199471104
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 53) Pool Name Active Pending Blocked
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 68) ReadStage 4 4 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) RequestResponseStage 1 6 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReadRepairStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) MutationStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReplicateOnWriteStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) GossipStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) AntiEntropyStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MigrationStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MemtablePostFlusher 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) FlushWriter 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) MiscStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) commitlog_archiver 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) InternalResponseStage 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) AntiEntropySessions 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) HintedHandoff 0 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,843 StatusLogger.java (line 73) CompactionManager 0 0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 85) MessagingService n/a 15,1
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 95) Cache Type Size Capacity KeysToSave Provider
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 96) KeyCache 251658064 251658081 all
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 102) RowCache 0 0 all org.apache.cassandra.cache.SerializingCacheProvider
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 109) ColumnFamily Memtable ops,data
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.local 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.peers 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.batchlog 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.NodeIdInfo 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.LocationInfo 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Schema 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Migrations 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_keyspaces 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columns 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columnfamilies 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.IndexInfo 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.range_xfers 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.peer_events 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.hints 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.HintsColumnFamily 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo2 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo3 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo4 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo5 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo6 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo7 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_auth.users 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.sessions 0,0
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.events 0,0
> WARN [ScheduledTasks:1] 2013-04-23 16:40:30,850 GCInspector.java (line 142) Heap is 0.824762725573964 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,850 StorageService.java (line 3537) Unable to reduce heap usage since there are no dirty column families
>
> On 23 April 2013 16:52, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> We are using DSE, which I believe is also 1.1.9. We have basically had a
> non-usable cluster for months due to this error. In our case, once it starts
> doing this it starts flushing sstables to disk and eventually fills up the
> disk to the point where it can't compact. If we catch it soon enough and
> restart the node it usually can recover.
>
> In our case, the heap size is 12 GB. As I understand it Cassandra will give
> 1/3 of that for sstables. I then noticed that we have one column family that
> is using nearly 4GB in bloom filters on each node. Since the nodes will
> start doing this when the heap reaches 9GB we essentially only have 1GB of
> free memory so when compactions, cleanups, etc take place this situation
> starts happening. We are working to change our data model to try to resolve
> this.
>
> Ralph
>
> On Apr 19, 2013, at 8:00 AM, Michael Theroux wrote:
>
> > Hello,
> >
> > We've recently upgraded from m1.large to m1.xlarge instances on AWS to
> > handle additional load, but to also relieve memory pressure. It appears to
> > have accomplished both, however, we are still getting a warning, 0-3 times
> > a day, on our database nodes:
> >
> > WARN [ScheduledTasks:1] 2013-04-19 14:17:46,532 GCInspector.java (line 145) Heap is 0.7529240824406468 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
> >
> > This is happening much less frequently than before the upgrade, but after
> > essentially doubling the amount of available memory, I'm curious on what I
> > can do to determine what is happening during this time.
> >
> > I am collecting all the JMX statistics. Memtable space is elevated but not
> > extraordinarily high. No GC messages are being output to the log.
> >
> > These warnings do seem to be occurring doing compactions of column families
> > using LCS with wide rows, but I'm not sure there is a direct correlation.
> >
> > We are running Cassandra 1.1.9, with a maximum heap of 8G.
> >
> > Any advice?
> > Thanks,
> > -Mike
> >