Robert Coli <rcoli <at> digg.com> writes:

> Check the size of the Hinted Handoff CF? If your nodes are flapping
> under sustained write, they could be storing a non-trivial number of
> hinted handoff rows? Probably not 5x usage though..
>
> http://wiki.apache.org/cassandra/Operations
> "
> The reason why you run nodetool cleanup on all live nodes [after
> replacing a node] is to remove old Hinted Handoff writes stored for
> the dead node.
> =Rob

Please see below and let me know if you think hinted handoff is to blame.

I do see some down/up activity according to the gossiping on the nodes.
Interestingly, I see mass "deaths" being detected on servers 1, 2, 4, 5,
and 6. Each server detects the "mass death" at a unique time, which makes
it look as though the server detecting the mass death is really the
culprit. Server3, the bloated node, is not having this problem.

As for nodes being reported as going down and coming back up (always
quickly), the number of down/up reports for each server is as follows:

server1: 16 times
server2: 17 times
server3: 12 times
server4: 18 times
server5: 24 times
server6: 20 times
server7: 13 times
server8: 12 times

Again, server3 is looking like a very healthy node, so you wouldn't think
it would have a backlog of hinted handoffs coming its way when the writes
complete. Server5 and server6 seem to be the least healthy.

Here's the streaming that took place on all 8 nodes, following the
800,000 row write (writes completed at 13:47):

server1-system.log: INFO [Thread-25] 2010-08-16 12:59:49,858 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-172-Data.db
server1-system.log: INFO [Thread-31] 2010-08-16 18:12:01,829 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-475-Data.db
server3-system.log: INFO [Thread-25] 2010-08-16 14:34:06,827 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-467-Data.db
server3-system.log: INFO [Thread-28] 2010-08-16 15:38:34,685 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-475-Data.db
server3-system.log: INFO [Thread-34] 2010-08-16 17:24:57,584 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-477-Data.db
server3-system.log: INFO [Thread-31] 2010-08-16 17:26:37,281 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-476-Data.db
server4-system.log: INFO [Thread-25] 2010-08-16 18:30:19,313 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-472-Data.db
server4-system.log: INFO [Thread-28] 2010-08-16 18:33:07,141 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-473-Data.db
server6-system.log: INFO [Thread-25] 2010-08-16 16:53:43,108 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-470-Data.db
server6-system.log: INFO [Thread-28] 2010-08-16 17:58:15,031 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-471-Data.db
server7-system.log: INFO [Thread-25] 2010-08-16 12:39:54,342 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-110-Data.db
server7-system.log: INFO [Thread-28] 2010-08-16 12:46:16,067 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-144-Data.db
server7-system.log: INFO [Thread-31] 2010-08-16 14:51:24,585 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-469-Data.db
server7-system.log: INFO [Thread-34] 2010-08-16 15:17:22,168 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-472-Data.db
server8-system.log: INFO [Thread-25] 2010-08-16 12:46:39,462 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-133-Data.db
server8-system.log: INFO [Thread-28] 2010-08-16 15:56:38,124 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-473-Data.db
server8-system.log: INFO [Thread-31] 2010-08-16 17:05:24,805 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-475-Data.db
server8-system.log: INFO [Thread-34] 2010-08-16 18:52:03,416 StreamCompletionHandler.java (line 64) Streaming added /var/lib/cassandra/data/Keyspace1/Standard1-476-Data.db

Do you see any red flags?

Thanks for your help!
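
P.S. In case it helps anyone compare nodes, here is a rough Python sketch of how the "Streaming added" lines quoted in this message could be grouped per server. It only assumes the grep-style format shown in this message (a "serverN-system.log:" prefix followed by the log entry); the function name summarize_streams is just something made up for illustration and is not part of Cassandra.

```python
import re
from collections import defaultdict

# Matches grep-style lines of the form quoted in this message:
#   <server>-system.log: INFO [Thread-N] <date> <time>,<ms>
#   StreamCompletionHandler.java (line N) Streaming added <path>
LINE_RE = re.compile(
    r"^(?P<server>\S+)-system\.log:\s+INFO\s+\[[^\]]+\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+"
    r"StreamCompletionHandler\.java \(line \d+\)\s+"
    r"Streaming added (?P<path>\S+)$"
)


def summarize_streams(lines):
    """Group 'Streaming added' events by server.

    Returns a dict mapping server name to a list of
    (timestamp, sstable file name) tuples, in input order.
    Lines that do not match the expected format are skipped.
    """
    by_server = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            file_name = m.group("path").rsplit("/", 1)[-1]
            by_server[m.group("server")].append((m.group("ts"), file_name))
    return dict(by_server)
```

Passing it the lines produced by something like grep "Streaming added" server*-system.log gives one list of (timestamp, file) pairs per server, which makes it easier to see which nodes received streamed SSTables and when.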