Hello All,

A while ago we had 3 Cassandra nodes on Amazon. At some point we decided to
buy some servers and deploy Cassandra there. The problem is that ever since,
a list of dead IPs shows up as UNREACHABLE nodes whenever we run "describe
cluster" in cassandra-cli.

I have seen other posts that describe similar issues, and the bottom line
is "it's harmless, but if you want to get rid of it, do a full cluster
restart" (I presume that means a rolling restart, not shutting down the
entire cluster, right?). Anyway...

We also came across another solution: install "libmx4j-java", uncomment the
respective line in "/etc/default/cassandra", restart the node, go to
"http://cassandra_node:8081/mbean?objectname=org.apache.cassandra.net%3Atype%3DGossiper",
type the dead IP(s) into the field next to "unsafeAssassinateEndpoint", and
invoke it. We did that on one of the nodes for each of the dead IPs. After
running "describe cluster" in the CLI on every node, we saw that there
were no UNREACHABLE nodes and everything looked OK.
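
In case it helps anyone repeating this on several nodes, here is a minimal
sketch of how the MX4J page address above can be built per host. The helper
name gossiper_mbean_url is made up for illustration, and port 8081 is simply
the one we used; only the object name and the "/mbean?objectname=" path come
from the URL quoted above.

```python
from urllib.parse import quote

# JMX object name of the Gossiper MBean, as it appears (decoded) in the URL above.
GOSSIPER_OBJECT_NAME = "org.apache.cassandra.net:type=Gossiper"


def gossiper_mbean_url(host, port=8081):
    """Return the MX4J HTTP page address for the Gossiper MBean on one node.

    The helper itself is hypothetical; it only percent-encodes the object
    name (":" becomes %3A, "=" becomes %3D) and appends it to the path.
    """
    encoded = quote(GOSSIPER_OBJECT_NAME, safe="")
    return "http://%s:%d/mbean?objectname=%s" % (host, port, encoded)


if __name__ == "__main__":
    print(gossiper_mbean_url("cassandra_node"))
```

This only builds the address of the page; the "unsafeAssassinateEndpoint"
operation is still invoked by hand from that page.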

However, when we run "nodetool gossipinfo" we get the following output:

/10.1.32.97
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.76851457173E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,56713727820156410577229101238628035243
/10.128.16.111
REMOVAL_COORDINATOR:REMOVER,113427455640312821154458202477256070486
STATUS:LEFT,42537039300520238181471502256297362072,1369471488145
/10.128.16.110
REMOVAL_COORDINATOR:REMOVER,1
STATUS:LEFT,42537092606577173116506557155915918934,1369471275829
/10.1.32.100
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.75649392881E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,85070591730234615865843651857942052863
/10.1.32.101
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.71158702006E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,141784319550391026443072753096570088105
/10.1.32.98
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.73163150773E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,113427455640312821154458202477256070486
/10.128.16.112
REMOVAL_COORDINATOR:REMOVER,1
STATUS:LEFT,42537092606577173116506557155915918934,1369471567719
/10.1.32.99
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.72271268395E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,28356863910078205288614550619314017621
/10.1.32.96
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
LOAD:2.71494331357E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,0

Does anyone know why the dead nodes still appear when we run "nodetool
gossipinfo" but they don't when we run "describe cluster" from the CLI?
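
To make the comparison easier to repeat, here is a rough sketch that pulls
the endpoints with STATUS:LEFT out of saved "nodetool gossipinfo" output.
The function names are made up for illustration. Treating the third
comma-separated field of a LEFT status as a timestamp in epoch milliseconds
is my assumption based on how the values look, not something I have
confirmed.

```python
from datetime import datetime, timezone

# A shortened excerpt of the gossipinfo output shown above.
SAMPLE = """\
/10.1.32.97
RELEASE_VERSION:1.0.11
STATUS:NORMAL,56713727820156410577229101238628035243
/10.128.16.111
REMOVAL_COORDINATOR:REMOVER,113427455640312821154458202477256070486
STATUS:LEFT,42537039300520238181471502256297362072,1369471488145
"""


def parse_gossipinfo(text):
    """Map each endpoint IP to a dict of its KEY:value gossip states."""
    states = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("/"):
            # A line such as "/10.1.32.97" starts a new endpoint section.
            current = line[1:]
            states[current] = {}
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            states[current][key] = value
    return states


def left_endpoints(states):
    """Return {ip: datetime or None} for endpoints whose STATUS is LEFT."""
    result = {}
    for endpoint, fields in states.items():
        parts = fields.get("STATUS", "").split(",")
        if parts[0] != "LEFT":
            continue
        when = None
        if len(parts) > 2:
            # Assumption: the third field is a timestamp in epoch milliseconds.
            when = datetime.fromtimestamp(int(parts[2]) // 1000, tz=timezone.utc)
        result[endpoint] = when
    return result


if __name__ == "__main__":
    for ip, when in left_endpoints(parse_gossipinfo(SAMPLE)).items():
        print(ip, when)
```

Run against the full output above, this should list only the three
10.128.16.x addresses, which are the old Amazon nodes.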

Thank you in advance for your help,

Vasilis
