You can't have two nodes with the same token (in the current metadata implementation). It causes problems counting things like how many replicas ACK a write: what happens if the node you're replacing ACKs a write but the joining host doesn't? It's harder than it seems to maintain consistency guarantees in that model, because you have two nodes, either of which may end up becoming the sole true owner of the token, and you have to handle both cases where one of them fails.
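To make the ACK-counting problem concrete, here's a toy model (not Cassandra code; the node names, RF=3, and the helper functions are all made up for illustration). OLD and NEW share one token during the replace, and a write is ACKed by OLD but never reaches NEW:

```python
# Toy model only: RF=3, so a quorum is 2 replicas.
RF = 3
QUORUM = RF // 2 + 1

def write(replicas, acking, key, value):
    """Apply the write on the nodes that ACK; return the number of ACKs."""
    for name in acking:
        replicas[name][key] = value
    return len(acking)

def read(replicas, contacted, key):
    """Return the value held by any contacted replica, or None if none has it."""
    values = [replicas[name].get(key) for name in contacted]
    return next((v for v in values if v is not None), None)

# A and B are ordinary replicas; OLD and NEW sit on the same token.
replicas = {"A": {}, "B": {}, "OLD": {}, "NEW": {}}

# The write is ACKed by A and OLD only. Counted naively, that is a quorum.
acks = write(replicas, ["A", "OLD"], "k", "v1")
write_succeeded = acks >= QUORUM

# Cut over: OLD is removed and NEW becomes the sole owner of the token.
del replicas["OLD"]

# A later quorum read that happens to contact B and NEW misses the write.
result = read(replicas, ["B", "NEW"], "k")
print(write_succeeded, result)  # True None
```

The write was acknowledged at quorum, yet a quorum read after the cut-over can return nothing. Handling that correctly means deciding which of the two nodes counts toward the quorum, and doing so for either failure case.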
An easier option is to add the new node with its token set to the old token + 1 (an expansion), then decommission the leaving node (a shrink). That'll minimize streaming when you decommission that node.

On Mon, May 8, 2023 at 7:19 PM Runtian Liu <curly...@gmail.com> wrote:

> Hi all,
>
> Sometimes we want to replace a node for various reasons, we can replace a
> node by shutting down the old node and letting the new node stream data
> from other replicas, but this approach may have availability issues or data
> consistency issues if one more node in the same cluster went down. Why
> Cassandra doesn't support replacing a node without shutting down the old
> one? Can we treat the new node as normal node addition while having exactly
> the same token ranges as the node to be replaced. After the new node's
> joining process is complete, we just need to cut off the old node. With
> this, we don't lose any availability and the token range is not moved so no
> clean up is needed. Is there any downside of doing this?
>
> Thanks,
> Runtian
>
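A rough sketch of the expand-then-shrink suggestion (new token = old token + 1), using a deliberately simplified ring: one token per node, primary ownership only, and a tiny made-up token space. It ignores replication factor and vnodes, and the node names and tokens are hypothetical. It only shows that the final ring is the original ring with the new node in the old node's place, shifted by a single token:

```python
RING_SIZE = 256  # tiny token space for illustration, not Cassandra's real one

def owner(ring, token):
    """Primary owner: the node with the smallest token >= `token`, wrapping around."""
    for t in sorted(ring):
        if token <= t:
            return ring[t]
    return ring[min(ring)]

ring = {10: "a", 100: "old", 200: "c"}

# Step 1 (expand): the replacement joins at old token + 1.
expanded = {**ring, 100 + 1: "new"}

# Step 2 (shrink): the old node is decommissioned.
final = {t: n for t, n in expanded.items() if n != "old"}

# Compare the final ring with the original, treating "new" as standing in for "old".
rename = {"old": "new"}
changed = [
    t for t in range(RING_SIZE)
    if owner(final, t) != rename.get(owner(ring, t), owner(ring, t))
]
print(changed)  # [101]
```

In this model the only token whose owner differs from a pure in-place swap is old token + 1 itself, and the two nodes never share a token. On a real cluster I'd expect the steps to correspond to setting `initial_token` on the joining node and running `nodetool decommission` on the leaving one, but treat that mapping as an assumption to check against your version.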