Michael Burman created CASSANDRA-20476:
------------------------------------------
             Summary: Cluster is unable to recover after shutdown if IPs change
                 Key: CASSANDRA-20476
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20476
             Project: Apache Cassandra
          Issue Type: Bug
          Components: Transactional Cluster Metadata
            Reporter: Michael Burman


When a cluster is shut down for any reason in an environment where the IPs can change, the current TCM implementation prevents the cluster from recovering. The previous Gossip-based system was able to restart correctly after this, but with TCM the first node to start gets stuck trying to find nodes that no longer exist, which prevents startup entirely.

What happens is that it spams the following to the logs:

{noformat}
WARN [InternalResponseStage:218] 2025-03-24 12:31:53,433 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
WARN [InternalResponseStage:219] 2025-03-24 12:32:03,496 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
WARN [Messaging-EventLoop-3-3] 2025-03-24 12:32:13,528 NoSpamLogger.java:107 - /10.244.4.8:7000->/10.244.3.4:7000-URGENT_MESSAGES-[no-channel] dropping message of type TCM_COMMIT_REQ whose timeout expired before reaching the network
WARN [InternalResponseStage:220] 2025-03-24 12:32:13,529 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
INFO [Messaging-EventLoop-3-6] 2025-03-24 12:32:23,373 NoSpamLogger.java:104 - /10.244.4.8:7000->/10.244.6.7:7000-URGENT_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out after 2000 ms: /10.244.6.7:7000
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:156)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:408)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)
{noformat}

and does not move forward. The node is assigned as its own seed node with its current IP address, which is 10.244.4.8 in this case:

{noformat}
INFO [main] 2025-03-24 11:55:16,938 InboundConnectionInitiator.java:165 - Listening on address: (/10.244.4.8:7000), nic: eth0, encryption: unencrypted
{noformat}

However, as seen from nodetool, it has no idea of this:

{noformat}
[cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load  Tokens  Owns (effective)  Host ID                               Rack
DN  10.244.4.7  ?     16      64.7%             6d194555-f6eb-41d0-c000-000000000001  r1
DN  10.244.6.7  ?     16      59.3%             6d194555-f6eb-41d0-c000-000000000002  r2
DN  10.244.3.4  ?     16      76.0%             6d194555-f6eb-41d0-c000-000000000003  r3

[cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool cms
Cluster Metadata Service:
Members: /10.244.3.4:7000
Needs reconfiguration: false
Is Member: false
Service State: REMOTE
Is Migrating: false
Epoch: 24
Local Pending Count: 0
Commits Paused: false
Replication factor: ReplicationParams{class=org.apache.cassandra.locator.MetaStrategy, dc1=1}
[cassandra@cluster1-dc1-r1-sts-0 /]$
{noformat}

The node also never starts listening on port 9042. It waits forever for the other nodes, not understanding that its own IP address has changed. Since this happens to all nodes, the entire cluster is basically dead.

In this configuration I used a 3-rack, 3-node system and simply stopped the cluster in Kubernetes. initial_location_provider was RackDCFileLocationProvider and node_proximity was NetworkTopologyProximity, as these should function like GossipingPropertyFileSnitch (according to the documentation); see the configuration sketch below. This functionality works fine with the older Gossip implementation. The IPs in a Kubernetes deployment change every time a pod is deleted, so assuming any sort of static IPs is not going to work and would be a serious downgrade from 5.0.
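For reference, a minimal sketch of the location/proximity configuration described above, assuming RackDCFileLocationProvider reads the usual cassandra-rackdc.properties. Only the initial_location_provider and node_proximity values and the dc1/r1 names come from this report; everything else is illustrative:

{noformat}
# cassandra.yaml (excerpt) -- intended to behave like GossipingPropertyFileSnitch
initial_location_provider: RackDCFileLocationProvider
node_proximity: NetworkTopologyProximity

# cassandra-rackdc.properties for the node in rack r1 (one file per node)
dc=dc1
rack=r1
{noformat}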