I took the "reset the world" approach. Things are much better now and the hints 
table is staying empty.  It is a bit disconcerting that it could get so large and 
not be able to recover by itself, but at least there was a solution.  Thanks


From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Thursday, March 15, 2012 7:24 PM
To: user@cassandra.apache.org
Subject: Re: Large hints column family

These messages make it look like the node is having trouble delivering hints.
INFO [HintedHandoff:1] 2012-03-13 16:13:34,188 HintedHandOffManager.java (line 
284) Endpoint /192.168.20.4 died before hint delivery, aborting
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 HintedHandOffManager.java (line 
354) Timed out replaying hints to /192.168.20.3; aborting further deliveries

Take another look at the logs on this machine and on 20.4 and 20.3.

I would be looking into why so many hints are being stored. GC? Are there also 
log messages about dropped messages?

If you want to reset the world, make sure the nodes have all run repair and 
then drop the hints, either via JMX or by stopping the node and deleting the 
files on disk.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/03/2012, at 12:58 PM, Bryce Godfrey wrote:


We were having some occasional memory pressure issues, but we just added some 
more RAM a few days ago to the nodes and things are running more smoothly now, 
but in general nodes have not been going up and down.

I tried to do a "list HintsColumnFamily" from Cassandra-cli and it locks my 
Cassandra node and never returns, forcing me to kill the Cassandra process and 
restart it to get the node back.

Here are my settings, which I believe are the defaults since I don't remember 
changing them:

hinted_handoff_enabled: true
max_hint_window_in_ms: 3600000 # one hour
hinted_handoff_throttle_delay_in_ms: 50

Grepping for "Hinted" in the system log, I get these:
INFO [HintedHandoff:1] 2012-03-13 16:13:22,215 HintedHandOffManager.java (line 
373) Finished hinted handoff of 852703 rows to endpoint /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 16:13:34,188 HintedHandOffManager.java (line 
284) Endpoint /192.168.20.4 died before hint delivery, aborting
INFO [ScheduledTasks:1] 2012-03-13 16:15:32,569 StatusLogger.java (line 65) 
HintedHandoff                     1         1         0
INFO [HintedHandoff:1] 2012-03-13 16:15:44,362 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 113427455640312814857969558651062452224 
with IP: /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 16:21:37,266 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 113427455640312814857969558651062452224 
with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-13 16:23:07,662 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:25:49,330 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:30:52,503 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:42:22,202 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 HintedHandOffManager.java (line 
354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 ColumnFamilyStore.java (line 
704) Enqueuing flush of Memtable-HintsColumnFamily@661547256(34298224/74465815 
serialized/live bytes, 78808 ops)
INFO [HintedHandoff:1] 2012-03-13 17:11:00,098 HintedHandOffManager.java (line 
373) Finished hinted handoff of 44160 rows to endpoint /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 17:11:36,596 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 56713727820156407428984779325531226112 
with IP: /192.168.20.4
INFO [ScheduledTasks:1] 2012-03-13 17:12:25,248 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-13 18:47:56,151 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 113427455640312814857969558651062452224 
with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-13 18:50:24,326 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:12:48,177 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:13:57,685 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:14:57,258 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:14:58,260 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:15:59,093 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:16:59,428 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:18:01,862 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:18:01,898 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:19:04,527 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:19:04,541 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:20:07,712 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:20:08,332 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-14 12:27:13,033 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 113427455640312814857969558651062452224 
with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-15 15:05:00,954 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:06:07,750 HintedHandOffManager.java (line 
354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
INFO [ScheduledTasks:1] 2012-03-15 15:06:07,802 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:06:07,809 ColumnFamilyStore.java (line 
704) Enqueuing flush of Memtable-HintsColumnFamily@254668880(103911/8312880 
serialized/live bytes, 63877 ops)
INFO [ScheduledTasks:1] 2012-03-15 15:07:13,503 StatusLogger.java (line 65) 
HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:15:43,842 HintedHandOffManager.java (line 
296) Started hinted handoff for token: 113427455640312814857969558651062452224 
with IP: /192.168.20.3
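
A log excerpt like the one above is easier to reason about once it is tallied per endpoint. The following is a minimal sketch, not part of the original thread: it assumes the message wording is exactly as it appears in these 1.0.8 excerpts, and the function name `summarize_hint_log` is hypothetical. It flattens wrapped lines first, because the mail client split each log entry across two or three lines.

```python
import re
import sys
from collections import Counter, defaultdict

# Message patterns copied from the log excerpts quoted in this thread.
# Assumption: other Cassandra versions may word these messages differently.
STARTED = re.compile(r"Started hinted handoff for token: \d+ with IP: /(\S+)")
FINISHED = re.compile(r"Finished hinted handoff of (\d+) rows to endpoint /(\S+)")
TIMED_OUT = re.compile(r"Timed out replaying hints to /([^;\s]+)")
DIED = re.compile(r"Endpoint /(\S+) died before hint delivery")


def summarize_hint_log(text):
    """Return {endpoint: {event: count}} for hinted handoff log messages."""
    # Collapse all whitespace (including line wraps) into single spaces so
    # that a message split across lines still matches as one string.
    flat = " ".join(text.split())
    summary = defaultdict(Counter)
    for match in STARTED.finditer(flat):
        summary[match.group(1)]["started"] += 1
    for match in FINISHED.finditer(flat):
        rows, endpoint = match.groups()
        summary[endpoint]["finished"] += 1
        summary[endpoint]["rows_delivered"] += int(rows)
    for match in TIMED_OUT.finditer(flat):
        summary[match.group(1)]["timed_out"] += 1
    for match in DIED.finditer(flat):
        summary[match.group(1)]["died"] += 1
    return {endpoint: dict(counts) for endpoint, counts in summary.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_hints.py <path to the node's system log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for endpoint, counts in sorted(summarize_hint_log(handle.read()).items()):
            print(endpoint, counts)
```

An endpoint with many "started" events but few "finished" ones, as /192.168.20.3 appears to have here, is a sign that delivery keeps restarting without completing.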


From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Thursday, March 15, 2012 1:51 AM
To: user@cassandra.apache.org
Subject: Re: Large hints column family

Is there anything going on in the logs? Are nodes going up and down? Can you 
see any messages about delivering hints?

If the query to read the hints fails, it will log "HintsCF getEPPendingHints 
timed out" at INFO level.

Also checking: do the hinted_handoff_* settings in cassandra.yaml have their 
default values?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/03/2012, at 8:35 AM, Bryce Godfrey wrote:



Forgot to mention that this is on 1.0.8

From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
Sent: Wednesday, March 14, 2012 12:34 PM
To: user@cassandra.apache.org
Subject: Large hints column family

The system HintsColumnFamily seems large in my cluster, and I want to track 
down why that is.  When I invoke "listEndpointsPendingHints()" on 
o.a.c.db.HintedHandoffManager, it never returns and also freezes the node 
that it's invoked against.  It's a 3 node cluster, and all nodes have been up 
and running without issue for a while.  Any help on where to start with this?

               Column Family: HintsColumnFamily
                SSTable count: 11
                Space used (live): 11271669539
                Space used (total): 11271669539
                Number of Keys (estimate): 1408
                Memtable Columns Count: 338
                Memtable Data Size: 0
                Memtable Switch Count: 1
                Read Count: 3
                Read Latency: 4354.669 ms.
                Write Count: 848
                Write Latency: 0.029 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 0
                Bloom Filter False Ratio: 0.00000
                Bloom Filter Space Used: 12656
                Key cache capacity: 14
                Key cache size: 11
                Key cache hit rate: 0.6666666666666666
                Row cache: disabled
                Compacted row minimum size: 105779
                Compacted row maximum size: 7152383774
                Compacted row mean size: 590818614
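
The byte counts in the stats above are hard to judge at a glance. The following is a small sketch, not part of the original message, that converts them to binary units and compares the largest compacted row with the live space used. The helper name `human_bytes` and the dictionary layout are hypothetical; the numbers are copied from the output above.

```python
def human_bytes(n):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Values copied from the HintsColumnFamily stats in this message.
hints_cf = {
    "Space used (live)": 11271669539,
    "Compacted row minimum size": 105779,
    "Compacted row maximum size": 7152383774,
    "Compacted row mean size": 590818614,
}

for name, size in hints_cf.items():
    print(f"{name}: {human_bytes(size)}")

# Fraction of the live space accounted for by the single largest row.
largest_share = (
    hints_cf["Compacted row maximum size"] / hints_cf["Space used (live)"]
)
print(f"Largest row / live space: {largest_share:.0%}")
```

With these figures, the column family holds about 10.5 GiB and the largest row alone accounts for well over half of it.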

Thanks,
Bryce
