For background, a discussion on estimating working set http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html . You can also just look at the size of tenured heap after a CMS.
Are you seeing lots of ParNew or CMS ? GC activity is a result of configuration *and* workload. Look in your data model for wide rows, or long lived rows that get a lot of deletes, and look in your code for large reads / writes (e.g. sometimes we read 100,000 columns from a row). > The number that really jumps out at me below is the number of Pending > requests for the Message Service. 24,000+ pending requests. > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,702 StatusLogger.java (line 89) > MessagingService n/a 24,229 Technically speaking that ain't right. The whole server looks unhappy. Are there any errors in the logs ? Are all the nodes up ? A very blunt approach is to reduce the in_memory_compaction_limit and the concurrent_compactors or compaction_throughput_mb_per_sec. This reduces the impact compaction and repair have on the system and may give you breathing space to look at other causes. Once you have a feel for what's going on you can turn them up. Hope that helps. A ----------------- Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 5/12/2012, at 7:04 AM, Mike <mthero...@yahoo.com> wrote: > Hello, > > Our Cassandra cluster has, relatively recently, started experiencing memory > pressure that I am in the midsts of diagnosing. Our system has uneven levels > of traffic, relatively light during the day, but extremely heavy during some > overnight processing. We have started getting a message: > > WARN [ScheduledTasks:1] 2012-12-04 09:08:58,579 GCInspector.java (line 145) > Heap is 0.7520105072262254 full. You may need to reduce memtable and/or > cache sizes. Cassandra will now flush up to the two largest memtables to > free up memory. Adjust flush_largest_memtables_at threshold in > cassandra.yaml if you don't want Cassandra to do this automatically > > I've started implementing some instrumentation to gather stats from JMX to > determine what is happening. However, last night, the GCInspector was kind > enough to log the information below. Couple of things jumped out at me. > > The maximum heap for the Cassandra is 4GB. We are running Cassandra 1.1.2, > on a 6 node cluster, with a replication factor of 3. All our queries use > LOCAL_QUORUM consistency. > > Adding up the caches + the memtable "data" in the trace below, comes to under > 600MB > > The number that really jumps out at me below is the number of Pending > requests for the Message Service. 24,000+ pending requests. > > Does this number represent the number of outstanding client requests that > this node is processing? If so, does this mean we potentially have 24,000 > responses being pulled into memory, thereby causing this memory issue? What > else should I look at? > > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,585 StatusLogger.java (line 57) > Pool Name Active Pending Blocked > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,695 StatusLogger.java (line 72) > ReadStage 32 66 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,696 StatusLogger.java (line 72) > RequestResponseStage 0 193 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,696 StatusLogger.java (line 72) > ReadRepairStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,696 StatusLogger.java (line 72) > MutationStage 2 2 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,697 StatusLogger.java (line 72) > ReplicateOnWriteStage 5 5 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,698 StatusLogger.java (line 72) > GossipStage 0 13 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,698 StatusLogger.java (line 72) > AntiEntropyStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,698 StatusLogger.java (line 72) > MigrationStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,699 StatusLogger.java (line 72) > StreamStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,699 StatusLogger.java (line 72) > MemtablePostFlusher 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,699 StatusLogger.java (line 72) > FlushWriter 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,700 StatusLogger.java (line 72) > MiscStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,700 StatusLogger.java (line 72) > commitlog_archiver 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,700 StatusLogger.java (line 72) > InternalResponseStage 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,701 StatusLogger.java (line 72) > AntiEntropySessions 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,701 StatusLogger.java (line 72) > HintedHandoff 0 0 0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,702 StatusLogger.java (line 77) > CompactionManager 2 4 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,702 StatusLogger.java (line 89) > MessagingService n/a 24,229 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,702 StatusLogger.java (line 99) > Cache Type Size Capacity KeysToSave Provider > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,702 StatusLogger.java (line 100) > KeyCache 2184533 2184533 all > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,703 StatusLogger.java (line 106) > RowCache 52385581 52428800 all > org.apache.cassandra.cache.SerializingCacheProvider > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,703 StatusLogger.java (line 113) > ColumnFamily Memtable ops,data > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,703 StatusLogger.java (line 116) > system.NodeIdInfo 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,704 StatusLogger.java (line 116) > system.IndexInfo 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,705 StatusLogger.java (line 116) > system.LocationInfo 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,705 StatusLogger.java (line 116) > system.Versions 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,705 StatusLogger.java (line 116) > system.schema_keyspaces 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,705 StatusLogger.java (line 116) > system.Migrations 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,706 StatusLogger.java (line 116) > system.schema_columnfamilies 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,706 StatusLogger.java (line 116) > system.schema_columns 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,706 StatusLogger.java (line 116) > system.HintsColumnFamily 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,706 StatusLogger.java (line 116) > system.Schema 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,707 StatusLogger.java (line 116) > open.comp 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,707 StatusLogger.java (line 116) > open.bp 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,707 StatusLogger.java (line 116) > open.bn 312832,47184787 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,707 StatusLogger.java (line 116) > open.p 711,193201 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,707 StatusLogger.java (line 116) > open.bid 273064,46316018 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,708 StatusLogger.java (line 116) > open.rel 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,708 StatusLogger.java (line 116) > open.images 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,708 StatusLogger.java (line 116) > open.users 62287,86665510 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,708 StatusLogger.java (line 116) > open.sessions 4710,13153051 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,709 StatusLogger.java (line 116) > open.userIndices 4,1960 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,709 StatusLogger.java (line 116) > open.caches 50,4813457 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,709 StatusLogger.java (line 116) > open.content 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,710 StatusLogger.java (line 116) > open.enrich 30,20793 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,744 StatusLogger.java (line 116) > open.bt 1133,776831 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,863 StatusLogger.java (line 116) > open.alias 253,163933 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,864 StatusLogger.java (line 116) > open.bymsgid 249610,73075517 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,864 StatusLogger.java (line 116) > open.rank 319956,70898417 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,865 StatusLogger.java (line 116) > open.cmap 448,406193 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,865 StatusLogger.java (line 116) > open.pmap 659,566220 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,865 StatusLogger.java (line 116) > open.pict 50944,58659596 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,878 StatusLogger.java (line 116) > open.w 0,0 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,879 StatusLogger.java (line 116) > open.s 92395,46160381 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,879 StatusLogger.java (line 116) > open.bymrel 136607,57780555 > INFO [ScheduledTasks:1] 2012-12-04 09:00:37,879 StatusLogger.java (line 116) > open.m 26720,51150067 > > It's appreciated, > Thanks, > -Mike