[ https://issues.apache.org/jira/browse/CASSANDRA-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Tunnicliffe reassigned CASSANDRA-20476:
-------------------------------------------

    Assignee: Sam Tunnicliffe

> Cluster is unable to recover after shutdown if IPs change
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-20476
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20476
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Transactional Cluster Metadata
>            Reporter: Michael Burman
>            Assignee: Sam Tunnicliffe
>            Priority: Normal
>             Fix For: 5.x
>
>
> When a cluster is shut down for any reason in an environment where IPs can
> change, the current TCM implementation prevents the cluster from recovering.
> The previous Gossip-based system was able to restart correctly after this,
> but with TCM the first node to start gets stuck trying to find nodes that no
> longer exist, and this prevents startup entirely.
> The node spams the following to the logs:
> {noformat}
> WARN [InternalResponseStage:218] 2025-03-24 12:31:53,433 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> WARN [InternalResponseStage:219] 2025-03-24 12:32:03,496 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> WARN [Messaging-EventLoop-3-3] 2025-03-24 12:32:13,528 NoSpamLogger.java:107 - /10.244.4.8:7000->/10.244.3.4:7000-URGENT_MESSAGES-[no-channel] dropping message of type TCM_COMMIT_REQ whose timeout expired before reaching the network
> WARN [InternalResponseStage:220] 2025-03-24 12:32:13,529 RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when sending TCM_COMMIT_REQ, retrying on CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> INFO [Messaging-EventLoop-3-6] 2025-03-24 12:32:23,373 NoSpamLogger.java:104 - /10.244.4.8:7000->/10.244.6.7:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.ConnectTimeoutException: connection timed out after 2000 ms: /10.244.6.7:7000
>     at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
>     at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
>     at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:156)
>     at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>     at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:408)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> {noformat}
> The node does not move forward from here. It is assigned as its own seed
> node, using its current IP address, which is 10.244.4.8 in this case:
> {noformat}
> INFO [main] 2025-03-24 11:55:16,938 InboundConnectionInitiator.java:165 - Listening on address: (/10.244.4.8:7000), nic: eth0, encryption: unencrypted
> {noformat}
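> To make the seeding concrete, here is a minimal sketch of the seed
> configuration a node ends up with in this deployment (the literal value is
> templated per pod by the operator, so the exact parameters are an
> assumption; only the address comes from the log above):
> {noformat}
> # cassandra.yaml (sketch; assumed to be templated per pod)
> seed_provider:
>   - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>     parameters:
>       # the pod's own, freshly assigned address
>       - seeds: "10.244.4.8"
> {noformat}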
> However, as seen from nodetool, the node has no knowledge of this address:
> {noformat}
> [cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool status
> Datacenter: dc1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load  Tokens  Owns (effective)  Host ID                               Rack
> DN  10.244.4.7  ?     16      64.7%             6d194555-f6eb-41d0-c000-000000000001  r1
> DN  10.244.6.7  ?     16      59.3%             6d194555-f6eb-41d0-c000-000000000002  r2
> DN  10.244.3.4  ?     16      76.0%             6d194555-f6eb-41d0-c000-000000000003  r3
> [cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool cms
> Cluster Metadata Service:
> Members: /10.244.3.4:7000
> Needs reconfiguration: false
> Is Member: false
> Service State: REMOTE
> Is Migrating: false
> Epoch: 24
> Local Pending Count: 0
> Commits Paused: false
> Replication factor: ReplicationParams{class=org.apache.cassandra.locator.MetaStrategy, dc1=1}
> [cassandra@cluster1-dc1-r1-sts-0 /]$
> {noformat}
> It also never starts listening on port 9042; it waits forever for the other
> nodes, never recognizing that its own IP address has changed. Since this
> happens on every node, the entire cluster is effectively dead.
> In this configuration I used a 3-rack, 3-node system and simply stopped the
> cluster in Kubernetes. initial_location_provider was
> RackDCFileLocationProvider and node_proximity was NetworkTopologyProximity,
> as these should behave like GossipingPropertyFileSnitch (according to the
> documentation); a sketch of that configuration follows below. This scenario
> works fine with the older Gossip-based implementation.
> On a Kubernetes deployment the IPs change every time a pod is deleted, so
> assuming any sort of static IPs is not going to work and would be a serious
> downgrade from 5.0.
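> For completeness, the location/proximity configuration described above,
> sketched out (the two cassandra.yaml values are verbatim from this report;
> the cassandra-rackdc.properties contents are an assumption reconstructed
> from the dc/rack names in the nodetool output):
> {noformat}
> # cassandra.yaml (sketch)
> initial_location_provider: RackDCFileLocationProvider
> node_proximity: NetworkTopologyProximity
>
> # cassandra-rackdc.properties for the pod in rack r1 (sketch)
> dc=dc1
> rack=r1
> {noformat}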