I have a Cassandra 3.11.2 cluster with two DCs. While running repair, I am
observing the following behavior.

I am seeing that a node is not able to receive the Merkle tree from one or
two other nodes. I can also see that the missing nodes did send their Merkle
trees, but they were not received. This makes repair hang on a consistent
basis. In netstats I see the following output:

Mode: NORMAL
Not sending any streams. Attempted: 7858888
Mismatch (Blocking): 2560
Mismatch (Background): 17173
Pool Name        Active  Pending  Completed  Dropped
Large messages   n/a     0        6313       3
Small messages   n/a     0        55978004   3
Gossip messages  n/a     0        93756      125

Does this indicate network issues? In the debug logs I saw the following:

DEBUG [MessagingService-Outgoing-hostname/xxx.yy.zz.kk-Large] 2022-01-14 05:00:19,031 OutboundTcpConnection.java:349 - Error writing to hostname/xxx.yy.zz.kk
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_221]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_221]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_221]
    at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_221]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_221]
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[na:1.8.0_221]
    at java.nio.channels.Channels.writeFully(Channels.java:98) ~[na:1.8.0_221]
    at java.nio.channels.Channels.access$000(Channels.java:61) ~[na:1.8.0_221]
    at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_221]
    at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:205) ~[lz4-1.3.0.jar:na]
    at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:158) ~[lz4-1.3.0.jar:na]

Does this show any network fluctuations?
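
One way to check whether the Dropped counters are still growing while the repair runs is to compare the pool table from two netstats captures. Below is a minimal sketch, assuming the five-column layout shown above (pool name, Active, Pending, Completed, Dropped). The function names `parse_pools` and `dropped_delta` are hypothetical helpers made up for illustration and are not part of any Cassandra tooling.

```python
import re

# One pool row, e.g. "Large messages  n/a  0  6313  3".
# Leading non-word characters (whitespace, stray '*') are skipped.
ROW = re.compile(r"^\W*(\w+) messages\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def parse_pools(netstats_text):
    """Return {pool_name: {"pending": int, "completed": int, "dropped": int}}."""
    pools = {}
    for line in netstats_text.splitlines():
        match = ROW.match(line)
        if match:
            name, _active, pending, completed, dropped = match.groups()
            pools[name] = {
                "pending": int(pending),
                "completed": int(completed),
                "dropped": int(dropped),
            }
    return pools


def dropped_delta(before, after):
    """Return how much each pool's Dropped counter grew between two captures."""
    return {
        name: stats["dropped"] - before.get(name, {"dropped": 0})["dropped"]
        for name, stats in after.items()
    }
```

Feeding it the text of two captures taken a few minutes apart during the repair would show whether the Large, Small, or Gossip pools are still dropping messages, which seems more informative than the absolute totals.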

Regards
Manish
