[ 
https://issues.apache.org/jira/browse/CASSANDRA-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-20476:
-------------------------------------------

    Assignee: Sam Tunnicliffe

> Cluster is unable to recover after shutdown if IPs change
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-20476
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20476
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Transactional Cluster Metadata
>            Reporter: Michael Burman
>            Assignee: Sam Tunnicliffe
>            Priority: Normal
>             Fix For: 5.x
>
>
> When a cluster is shut down for any reason in an environment where the IPs 
> can change, the current TCM implementation prevents the cluster from 
> recovering. The previous Gossip-based system was able to restart correctly 
> after such an event, but with TCM the first node to start gets stuck trying 
> to find nodes that no longer exist, which prevents it from starting at all. 
> It spams the following to the logs:
> {noformat}
> WARN  [InternalResponseStage:218] 2025-03-24 12:31:53,433 
> RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when 
> sending TCM_COMMIT_REQ, retrying on 
> CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> WARN  [InternalResponseStage:219] 2025-03-24 12:32:03,496 
> RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when 
> sending TCM_COMMIT_REQ, retrying on 
> CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> WARN  [Messaging-EventLoop-3-3] 2025-03-24 12:32:13,528 NoSpamLogger.java:107 
> - /10.244.4.8:7000->/10.244.3.4:7000-URGENT_MESSAGES-[no-channel] dropping 
> message of type TCM_COMMIT_REQ whose timeout expired before reaching the 
> network
> WARN  [InternalResponseStage:220] 2025-03-24 12:32:13,529 
> RemoteProcessor.java:227 - Got error from /10.244.3.4:7000: TIMEOUT when 
> sending TCM_COMMIT_REQ, retrying on 
> CandidateIterator{candidates=[/10.244.3.4:7000], checkLive=true}
> INFO  [Messaging-EventLoop-3-6] 2025-03-24 12:32:23,373 NoSpamLogger.java:104 
> - /10.244.4.8:7000->/10.244.6.7:7000-URGENT_MESSAGES-[no-channel] failed to 
> connect
> io.netty.channel.ConnectTimeoutException: connection timed out after 2000 ms: 
> /10.244.6.7:7000
>         at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
>         at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
>         at 
> io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:156)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
>         at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
>         at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:408)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>         at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> {noformat}
> The node does not move forward. It is assigned as its own seed node with 
> its current IP address, which is 10.244.4.8 in this case:
> {noformat}
> INFO  [main] 2025-03-24 11:55:16,938 InboundConnectionInitiator.java:165 - 
> Listening on address: (/10.244.4.8:7000), nic: eth0, encryption: unencrypted
> {noformat}
> However, as nodetool shows, the node is unaware of this:
> {noformat}
> [cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool status
> Datacenter: dc1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load  Tokens  Owns (effective)  Host ID                       
>         Rack
> DN  10.244.4.7  ?     16      64.7%             
> 6d194555-f6eb-41d0-c000-000000000001  r1  
> DN  10.244.6.7  ?     16      59.3%             
> 6d194555-f6eb-41d0-c000-000000000002  r2  
> DN  10.244.3.4  ?     16      76.0%             
> 6d194555-f6eb-41d0-c000-000000000003  r3  
> [cassandra@cluster1-dc1-r1-sts-0 /]$ nodetool cms
> Cluster Metadata Service:
> Members: /10.244.3.4:7000
> Needs reconfiguration: false
> Is Member: false
> Service State: REMOTE
> Is Migrating: false
> Epoch: 24
> Local Pending Count: 0
> Commits Paused: false
> Replication factor: 
> ReplicationParams{class=org.apache.cassandra.locator.MetaStrategy, dc1=1}
> [cassandra@cluster1-dc1-r1-sts-0 /]$
> {noformat}
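To illustrate the failure mode shown above, here is a hypothetical sketch (not the actual Cassandra code): the persisted CMS member list still names the old address /10.244.3.4:7000, and every TIMEOUT simply re-queues the same stale candidate, so the retry loop can never reach a live peer.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the retry behaviour, not Cassandra's CandidateIterator:
// candidates come from persisted cluster metadata (old IPs). If none of them
// is live any more, every attempt times out and the same candidate is retried
// forever.
public class CandidateRetrySketch {
    static int attemptsUntilSuccess(List<String> persistedCandidates,
                                    Set<String> liveAddresses,
                                    int maxAttempts) {
        Queue<String> queue = new ArrayDeque<>(persistedCandidates);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String candidate = queue.poll();
            if (liveAddresses.contains(candidate))
                return attempt;   // a commit against a live peer would succeed
            queue.add(candidate); // TIMEOUT: re-queue the same stale address
        }
        return -1;                // never reached a live peer
    }

    public static void main(String[] args) {
        // Persisted CMS member is 10.244.3.4, but after the restart the only
        // live address is the node's own new IP, 10.244.4.8.
        int result = attemptsUntilSuccess(
                List.of("/10.244.3.4:7000"),
                Set.of("/10.244.4.8:7000"),
                100);
        System.out.println(result); // -1: retries can never succeed
    }
}
```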
> It also never starts listening on port 9042. It waits for the other nodes 
> forever, not realising that its own IP address has changed. Since this 
> happens to every node, the entire cluster is effectively dead. 
> In this setup I used a 3-rack, 3-node system and simply stopped the cluster 
> in Kubernetes. initial_location_provider was RackDCFileLocationProvider and 
> node_proximity was NetworkTopologyProximity, as these should function like 
> GossipingPropertyFileSnitch (according to the documentation). This scenario 
> works fine with the older Gossip implementation. 
> In a Kubernetes deployment the IPs change every time a pod is deleted, so 
> assuming any sort of static IPs is not going to work and would be a serious 
> downgrade from 5.0.
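A minimal sketch of the configuration described above. Only the two location/proximity options come from the report; the seed_provider block is an illustrative assumption about a typical Kubernetes setup, where seeds are resolved through a headless-service DNS name (a made-up name here) rather than static pod IPs.

```yaml
# Options named in the report; these replace GossipingPropertyFileSnitch
# in the TCM-era configuration.
initial_location_provider: RackDCFileLocationProvider
node_proximity: NetworkTopologyProximity

# Illustrative assumption: on Kubernetes, seeds usually point at a stable
# DNS name because pod IPs change on every restart.
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "cluster1-seed-service.cassandra.svc.cluster.local"
```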



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
