We have a 28-node cluster in which only one node is experiencing timeouts. We suspected the RAID, but two other nodes on the same RAID have no problems. The problem goes away when we reboot the node, then reappears after seven days. The hinted handoff timeouts below are seen on the affected node. We did not notice any gossip errors.
I was wondering if anyone has seen this issue and how they resolved it.

Cassandra Version: 1.2.15.1
OS: Linux cm 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
java version "1.6.0_85"

------------------------------------------------------------------------
INFO [HintedHandoff:2] 2015-06-17 22:52:08,130 HintedHandOffManager.java (line 296) Started hinted handoff for host: 4fe86051-6bca-4c28-b09c-1b0f073c1588 with IP: /192.168.1.122
INFO [HintedHandoff:1] 2015-06-17 22:52:08,131 HintedHandOffManager.java (line 296) Started hinted handoff for host: bbf0878b-b405-4518-b649-f6cf7c9a6550 with IP: /192.168.1.119
INFO [HintedHandoff:2] 2015-06-17 22:52:17,634 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.122; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:17,635 HintedHandOffManager.java (line 296) Started hinted handoff for host: f7b7ab10-4d42-4f0c-af92-2934a075bee3 with IP: /192.168.1.108
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.119; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 296) Started hinted handoff for host: ddb79f35-3e2b-4be8-84d8-7942086e2b73 with IP: /192.168.1.104
INFO [HintedHandoff:2] 2015-06-17 22:52:27,143 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.108; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:27,144 HintedHandOffManager.java (line 296) Started hinted handoff for host: 6a2fa431-4a51-44cb-af19-1991c960e075 with IP: /192.168.1.117
INFO [HintedHandoff:1] 2015-06-17 22:52:27,153 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.104; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:27,154 HintedHandOffManager.java (line 296) Started hinted handoff for host: cf03174a-533c-44d6-a679-e70090ad2bc5 with IP: /192.168.1.107
------------------------------------------------------------------------

Thanks
-shashi..