[ 
https://issues.apache.org/jira/browse/CASSANDRA-20610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948651#comment-17948651
 ] 

Chris Miller commented on CASSANDRA-20610:
------------------------------------------

We are facing intermittent issues on our production Cassandra cluster (4.1.2) 
whereby an instance becomes unavailable to the application and nodetool 
describecluster shows all other instances in the cluster as unreachable, 
whereas nodetool status shows all instances as UN.

We initially see the following reported on some of the "remote" instances:

 
{code:java}
INFO [Messaging-EventLoop-3-9] 2025-04-25 09:31:53,230 OutboundConnection.java:1059 - /xx.xx.xx.130:7000->/xx.xx.xx.15:7000-SMALL_MESSAGES-e3a8f8e3 channel closed by provider
{code}
 

And a corresponding entry on the "problem" instance:

 
{code:java}
ERROR [Messaging-EventLoop-3-13] 2025-04-25 09:31:54,041 InboundMessageHandler.java:298 - /xx.xx.xx.130:7000->/xx.xx.xx.15:7000-SMALL_MESSAGES-fe780889 unexpected exception caught while processing inbound messages; terminating connection
io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
{code}
 

We also start seeing operation timeouts:
{code:java}
INFO  [Native-Transport-Requests-51] 2025-04-25 09:33:58,404 NoSpamLogger.java:105 - "Operation timed out - received only 0 responses." while executing SELECT * FROM system_auth.roles WHERE role = 'xxxx' ALLOW FILTERING
{code}
Our sysadmins have confirmed that there were no network issues at that time.

Restarting the Cassandra instance resolves the issue.

Any ideas on what might be causing this to happen?

Any tests you'd like me to complete next time this happens?

Thanks, 

Chris.

 

 

> nodetool unreachable shows all other instances down
> ---------------------------------------------------
>
>                 Key: CASSANDRA-20610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20610
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Chris Miller
>            Priority: Normal
>



