Hello All,

A while ago we had 3 Cassandra nodes on Amazon. At some point we decided to buy some servers and deploy Cassandra there. The problem is that ever since then, a list of dead IPs has shown up as UNREACHABLE nodes whenever we run "describe cluster" in cassandra-cli.
I have seen other posts which describe similar issues, and the bottom line is "it's harmless, but if you want to get rid of it, do a full cluster restart" (I presume that means a rolling restart, not shutting down the entire cluster, right?).

Anyway, we also came across another solution:

1. Install "libmx4j-java".
2. Uncomment the respective line in "/etc/default/cassandra".
3. Restart the node.
4. Go to "http://cassandra_node:8081/mbean?objectname=org.apache.cassandra.net%3Atype%3DGossiper".
5. Type the dead IP(s) into the field next to "unsafeAssassinateEndpoint" and invoke it.

So we did that on one of the nodes for the list of dead IPs. After running "describe cluster" in the CLI on every node, we saw that there were no UNREACHABLE nodes and everything looked OK.

However, when we run "nodetool gossipinfo" we get the following output:

    /10.1.32.97
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.76851457173E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,56713727820156410577229101238628035243
    /10.128.16.111
      REMOVAL_COORDINATOR:REMOVER,113427455640312821154458202477256070486
      STATUS:LEFT,42537039300520238181471502256297362072,1369471488145
    /10.128.16.110
      REMOVAL_COORDINATOR:REMOVER,1
      STATUS:LEFT,42537092606577173116506557155915918934,1369471275829
    /10.1.32.100
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.75649392881E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,85070591730234615865843651857942052863
    /10.1.32.101
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.71158702006E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,141784319550391026443072753096570088105
    /10.1.32.98
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.73163150773E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,113427455640312821154458202477256070486
    /10.128.16.112
      REMOVAL_COORDINATOR:REMOVER,1
      STATUS:LEFT,42537092606577173116506557155915918934,1369471567719
    /10.1.32.99
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.72271268395E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,28356863910078205288614550619314017621
    /10.1.32.96
      RELEASE_VERSION:1.0.11
      SCHEMA:b1116df0-b3dd-11e2-0000-16fe4da5dbff
      LOAD:2.71494331357E11
      RPC_ADDRESS:0.0.0.0
      STATUS:NORMAL,0

Does anyone know why the dead nodes (the 10.128.16.x entries with STATUS:LEFT) still appear when we run "nodetool gossipinfo", but not when we run "describe cluster" from the CLI?

Thank you in advance for your help,
Vasilis
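
P.S. In case it helps anyone compare the two views, here is a rough Python sketch of how the gossipinfo text could be filtered down to just the endpoints that gossip still reports as LEFT. It is only a sketch: the function names parse_gossipinfo and endpoints_with_status are made up for illustration, and it assumes the output consists of endpoint entries starting with "/" followed by whitespace-separated KEY:VALUE fields, as in the output above. The sample text is a shortened copy of two entries from that output.

```python
def parse_gossipinfo(text):
    """Parse gossipinfo-style text into {endpoint: {key: value}}.

    Assumption: every endpoint token starts with "/" and every
    following token has the form KEY:VALUE until the next endpoint.
    """
    nodes = {}
    current = None
    for token in text.split():
        if token.startswith("/"):
            current = token[1:]          # drop the leading slash
            nodes[current] = {}
        elif current is not None and ":" in token:
            key, value = token.split(":", 1)
            nodes[current][key] = value
    return nodes


def endpoints_with_status(nodes, status):
    """Return the sorted endpoints whose STATUS begins with `status`."""
    return sorted(
        endpoint
        for endpoint, state in nodes.items()
        if state.get("STATUS", "").split(",")[0] == status
    )


# Shortened sample taken from the output above.
sample = """
/10.1.32.97
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,56713727820156410577229101238628035243
/10.128.16.111
  REMOVAL_COORDINATOR:REMOVER,113427455640312821154458202477256070486
  STATUS:LEFT,42537039300520238181471502256297362072,1369471488145
"""

nodes = parse_gossipinfo(sample)
print(endpoints_with_status(nodes, "LEFT"))
```

Because it splits on whitespace rather than on lines, the same parser should cope with the output whether it arrives one field per line or pasted as a single run of text.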