I've tested a scenario where I wanted to reuse a removed node in a new
cluster with the same IP. This may not be very common, but I found some
strange behaviour in the Gossiper.
Here is what I think/see happening:
- Cassandra 1.1. Three node cluster A, B and C.
- Shut down node C and remove the token for node C.
- Everything looks ok in the logs, which report that node C is removed etc.
- Nodes A and B still send gossip digests about the removed node, but I
guess that's ok since they know about it (Gossiper.endpointStateMap).
- Node C has status removed when checking in JMX console.
- Checked in LocationInfo that Ring only contains token/IP for node A and B.
- Removed system/data tables for C.
- Changed seed on C to point to itself.
- Start up node C. Node C only gossips with itself, and nodes A and B
don't recognize that node C is running, which is correct.
- Restart e.g. node A. Node A will now lose all gossip information
(Gossiper.endpointStateMap) about node C. Node A will request
information from LocationInfo and ask node B about endpoint states.
Node A will receive information about node C from node B, which
triggers Gossiper.handleMajorStateChange. Node C is first marked as
unreachable because it's in a dead state (removed). Node A will then
try to gossip to node C (as an unreachable endpoint), node C will reply
that it's up, and node C becomes incorporated into the "old" cluster
again.
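
To make the sequence above easier to follow, here is a minimal toy model
in Java. It is not Cassandra's Gossiper code, only a sketch of the
behaviour as I describe it, under the assumption that a restarted node
accepts a peer's state for an unknown endpoint and later marks that
endpoint live as soon as it answers gossip. All names (ToyNode, learn,
gossipToUnreachable) and the IP addresses are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class GossipRejoinSketch {
    enum Status { NORMAL, REMOVED }

    // Hypothetical stand-in for one node's in-memory gossip state.
    static class ToyNode {
        final String ip;
        final Map<String, Status> endpointStates = new HashMap<>();
        final Set<String> live = new TreeSet<>();
        final Set<String> unreachable = new TreeSet<>();

        ToyNode(String ip) {
            this.ip = ip;
            endpointStates.put(ip, Status.NORMAL);
        }

        // A restart drops everything except the node's own entry.
        void restart() {
            endpointStates.clear();
            live.clear();
            unreachable.clear();
            endpointStates.put(ip, Status.NORMAL);
        }

        // Assumed analogue of handling a previously unknown endpoint:
        // a removed endpoint is recorded as unreachable, others as live.
        void learn(String endpoint, Status status) {
            if (endpointStates.containsKey(endpoint)) {
                return;
            }
            endpointStates.put(endpoint, status);
            if (status == Status.REMOVED) {
                unreachable.add(endpoint);
            } else {
                live.add(endpoint);
            }
        }

        // Gossip to unreachable endpoints; any that reply become live again.
        void gossipToUnreachable(Set<String> respondingIps) {
            Iterator<String> it = unreachable.iterator();
            while (it.hasNext()) {
                String endpoint = it.next();
                if (respondingIps.contains(endpoint)) {
                    it.remove();
                    live.add(endpoint);
                    endpointStates.put(endpoint, Status.NORMAL);
                }
            }
        }
    }

    // Replays the last step: A restarts, learns about C from B, gossips to C.
    static Set<String> simulate() {
        ToyNode a = new ToyNode("10.0.0.1");
        ToyNode b = new ToyNode("10.0.0.2");
        String c = "10.0.0.3";

        b.endpointStates.put(a.ip, Status.NORMAL);
        b.endpointStates.put(c, Status.REMOVED); // B still remembers removed C

        a.restart();
        for (Map.Entry<String, Status> e : b.endpointStates.entrySet()) {
            a.learn(e.getKey(), e.getValue());
        }

        // C is up again (in its own new cluster) and answers on the same IP.
        a.gossipToUnreachable(Collections.singleton(c));
        return a.live;
    }

    public static void main(String[] args) {
        System.out.println("Live endpoints seen by A: " + simulate());
    }
}
```

In this toy model, node A ends up treating node C's address as live
again, which matches what I observe. Whether the real Gossiper follows
exactly this path is what I'm asking about.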
Is this a bug, or is it a requirement that if you take a node out of
the cluster you must change the IP of the removed node if you want to
use it in another cluster?
Please enlighten me.
Regards
/Fredrik