Thanks for the quick responses.

@Yi: I'm using Hector 0.8.0-1. Hardware is:
- AMD Opteron 4174, 6x 2.30+ GHz
- 32 GB DDR3
- 1 Gbps lossless network

@aaron: I'm running 0.8.6 on all nodes, straight from the Debian packages. I get hinted handoffs from time to time because of flapping; I've been planning to increase the phi convict threshold as per another thread but haven't yet (I've sketched the change below, just before the quoted thread). Here are the HH counts per node:

HintedHandoff                     0         0            437         0                 0
HintedHandoff                     0         0              2         0                 0
HintedHandoff                     0         0           1798         0                 0

I'm not seeing any iowait on my Munin CPU graph, at least not more than over the past couple of weeks. There is a little more on the 2nd node because it also holds a MySQL database that gets hit hard. The Munin iostat graph shows an average of 10-20 kilo blocks/s read/write.

Nothing special was happening besides the weekly repair on node two, which started yesterday at 4am. That one failed with:

ERROR [AntiEntropySessions:5] 2011-10-04 04:03:56,676 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Problem during repair session manual-repair-0d4125f0-aab7-403b-a083-7c19ef6579b1, endpoint /xx.xx.xx.180 died

Then the next planned repairs failed before starting:

INFO [AntiEntropySessions:8] 2011-10-04 04:03:57,137 AntiEntropyService.java (line 658) Could not proceed on repair because a neighbor (/xx.xx.xx.180) is dead: manual-repair-5dc4c221-3a15-4031-9aa8-0931e41816cd failed

Looking at the logs on that node shows no exception. I was about to say "nothing special was happening at that time", except that at around 4am the GC started working hard: it got the heap down to 9 GB, then usage shot straight back up to almost 16 GB, so I guess ParNew couldn't keep up and ConcurrentMarkSweep had to step in, which basically hung the server? It took another 2 minutes until I got the "Heap is 0.75 full" message, and I see a lot of StatusLogger messages before that.

So it looks like computing the Merkle tree was very expensive this time... I wonder why? Is there anything I can do to handle this? (Some cassandra.yaml knobs I'm considering are also sketched below.)

INFO [ScheduledTasks:1] 2011-10-04 04:03:12,874 GCInspector.java (line 122) GC for ParNew: 427 ms for 2 collections, 16227154488 used; max is 16838033408
INFO [GossipTasks:1] 2011-10-04 04:04:24,092 Gossiper.java (line 697) InetAddress /xx.xx.xx.97 is now dead.
INFO [ScheduledTasks:1] 2011-10-04 04:04:24,093 GCInspector.java (line 122) GC for ParNew: 26376 ms for 2 collections, 8832125416 used; max is 16838033408
(no GC logs until)
INFO [ScheduledTasks:1] 2011-10-04 04:04:24,251 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 16250 ms for 3 collections, 9076209720 used; max is 16838033408
WARN [ScheduledTasks:1] 2011-10-04 04:06:52,777 GCInspector.java (line 143) Heap is 0.752707974197173 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
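For reference, this is the failure-detector change I had in mind for the flapping. It's only a sketch: the value is what was suggested in that other thread, I haven't tested it here, and as far as I know 8 is the stock default.

# cassandra.yaml -- failure detector sensitivity (default is 8).
# Raising it makes the detector more tolerant of short GC pauses and
# network hiccups, so nodes should flap less and fewer hints pile up.
# The trade-off is that genuinely dead nodes take longer to be marked down.
phi_convict_threshold: 12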
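And these are the other cassandra.yaml knobs I'm considering around the heap pressure during repair. The comments are just my understanding of how 0.8 behaves, so please correct me if I've got it wrong.

# The 0.75 threshold from the WARN above; lowering it makes the
# emergency "flush the largest memtables" step kick in a little earlier.
flush_largest_memtables_at: 0.70

# Repair builds its Merkle trees via a validation compaction, which I
# believe is subject to compaction throttling, so lowering this (default
# is 16, I think) should spread the repair I/O out over a longer period.
compaction_throughput_mb_per_sec: 8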
2011/10/5 aaron morton <aa...@thelastpickle.com>

> Lots of hinted handoff can give you mutations…
>
> HintedHandoff                     0         0           1798         0                 0
>
> 1798 is somewhat high. This is the HH tasks on this node though, can you
> see HH running on other nodes in the cluster? What has been happening on
> this node?
>
> HH is throttled to avoid this sort of thing, what version are you on?
>
> Also looks like the disk IO could not keep up with the flushing…
>
> FlushWriter                       0         0           5714         0               499
>
> You need to provide some more info on what was happening to nodes
> beforehand. And check the logs on all machines for errors etc.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/10/2011, at 9:52 PM, Yi Yang wrote:
>
> Well, what client are you using? And can you give a hint to your node
> hardware?
>
> From my BlackBerry® wireless device
> ------------------------------
> From: Philippe <watche...@gmail.com>
> Date: Wed, 5 Oct 2011 10:33:21 +0200
> To: user <user@cassandra.apache.org>
> Reply-To: user@cassandra.apache.org
> Subject: Why is mutation stage increasing ??
>
> Hello,
> I have my 3-node, RF=3 cluster acting strangely. Can someone shed some
> light on what is going on?
> It was stuck for a couple of hours (all clients timed out). nodetool
> tpstats showed huge, increasing MutationStage counts (in the hundreds of
> thousands).
> I restarted one node and it took a while to replay GBs of commitlog. I've
> shut down all clients that write to the cluster and it's just weird.
>
> All nodes are still showing huge MutationStage counts, including the
> restarted one, and the numbers are either increasing or stable. The active
> count is stuck at 32.
> Compactionstats shows no compaction on 2 nodes and dozens of Scrub
> compactions (all at 100%) on the 3rd one. This is a scrub I did last week
> when I encountered assertion errors.
> Netstats shows no streams being exchanged at any node but each one is
> expecting a few Responses.
>
> Any ideas?
> Thanks
>
> For example (increased to 567062 while I was writing this email):
>
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         0         0    18372664517         0                 0
> RequestResponseStage              0         0    10731370183         0                 0
> MutationStage                    32    565879      295492216         0                 0
> ReadRepairStage                   0         0          23654         0                 0
> ReplicateOnWriteStage             0         0        7733659         0                 0
> GossipStage                       0         0        3502922         0                 0
> AntiEntropyStage                  0         0           1631         0                 0
> MigrationStage                    0         0              0         0                 0
> MemtablePostFlusher               0         0           5716         0                 0
> StreamStage                       0         0             10         0                 0
> FlushWriter                       0         0           5714         0               499
> FILEUTILS-DELETE-POOL             0         0            773         0                 0
> MiscStage                         0         0           1266         0                 0
> FlushSorter                       0         0              0         0                 0
> AntiEntropySessions               0         0             18         0                 0
> InternalResponseStage             0         0              0         0                 0
> HintedHandoff                     0         0           1798         0                 0
>
> Mode: Normal
> Not sending any streams.
> Not receiving any streams.
> Pool Name                    Active   Pending      Completed
> Commands                        n/a         0     1223769753
> Responses                       n/a         4     1627481305