io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer
Hi,

We have a 5-node Cassandra cluster running version 3.0.13. We recently upgraded the Cassandra cpp driver on the application side from cassandra-cpp-driver-2.6.0-1.el7.centos.x86_64 to cassandra-cpp-driver-2.15.3-1.el7.x86_64. Since the upgrade, the Cassandra system.log has been continuously filling with the message below.

INFO [SharedPool-Worker-11] 2021-04-26 07:11:18,445 Message.java:615 - Unexpected exception during request; channel = [id: 0x08a9bc0f, L:/10.50.11.123:9042 ! R:/10.50.11.182:44734]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]

These INFO-level entries cause system.log to be rotated 4 or 5 times a day, but we see no functional impact.

What could be the problem here? Let me know if more details are needed.

Regards,
Renoy Paulose
Re: io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer
Hi Renoy,

The message below shows connection hiccups between the remote and local machines.

Unexpected exception during request; channel = [id: 0x08a9bc0f, L:/10.50.11.123:9042 ! R:/10.50.11.182:44734]

L is the local address (the node that logged the message) and R is the remote address (the client that was connected to it). Please check for connection issues or firewall restrictions between the two machines.

Since you upgraded the driver recently, I suggest also checking whether any driver configuration is missing.

Thanks,
Sunil Pawar

On Mon, Apr 26, 2021 at 12:50 PM velocix cephusers <velocixcephus...@gmail.com> wrote:

> Hi,
>
> We are having a 5 node Cassandra cluster running in version 3.0.13.
> Recently we upgrade the Cassandra cpp driver on the application side
> from cassandra-cpp-driver-2.6.0-1.el7.centos.x86_64
> to cassandra-cpp-driver-2.15.3-1.el7.x86_64. After the upgrade, Cassandra
> system.log is continuously filled with the below message.
>
> INFO [SharedPool-Worker-11] 2021-04-26 07:11:18,445 Message.java:615 -
> Unexpected exception during request; channel = [id: 0x08a9bc0f, L:/
> 10.50.11.123:9042 ! R:/10.50.11.182:44734]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)()
> failed: Connection reset by peer
> at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>
> Even though these INFO level log prints are causing system.log to be
> rotated 4 or 5 times a day, there is no functional impact seen.
>
> What could be the problem here? Let me know if more details are needed.
>
> Regards,
> Renoy Paulose
>
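To see which clients are resetting connections, the L and R fields described in the reply above can be pulled out of each log line. The following is only a rough sketch: the function name `parse_channel` and the regular expression are made up for illustration, and it assumes IPv4 addresses in the same layout as the sample message.

```python
import re

# Hypothetical pattern for the "L:/ip:port ! R:/ip:port" part of the channel
# description. Whitespace after "L:/" is tolerated because mail clients may
# wrap the line there; "-" is accepted as a separator as well as "!".
CHANNEL_RE = re.compile(
    r"L:/\s*(?P<local_ip>[\d.]+):(?P<local_port>\d+)\s*[-!]\s*"
    r"R:/\s*(?P<remote_ip>[\d.]+):(?P<remote_port>\d+)"
)

def parse_channel(line):
    """Return the local and remote (ip, port) pairs, or None if absent."""
    m = CHANNEL_RE.search(line)
    if m is None:
        return None
    return {
        "local": (m["local_ip"], int(m["local_port"])),
        "remote": (m["remote_ip"], int(m["remote_port"])),
    }

sample = ("Unexpected exception during request; channel = "
          "[id: 0x08a9bc0f, L:/ 10.50.11.123:9042 ! R:/10.50.11.182:44734]")
print(parse_channel(sample))
```

Counting the "remote" IPs across system.log would show whether the resets come from every application host or only a few of them.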
Re: io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer
That message gets logged when the node tries to respond to the client but the driver has already given up waiting for the cluster, so the connection is no longer active.

It typically happens when an expensive query is running: the coordinator is still waiting for the replicas to respond, but the driver has already reached its client-side timeout. It can also happen when the driver is configured with a very low timeout value, so the coordinator never gets a chance to respond.

Check the timeouts configured on the driver. I'd also recommend reviewing the app queries for clues.

Cheers!
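One quick way to act on the timeout advice is to compare the driver's request timeout with the coordinator's own timeouts from cassandra.yaml. The sketch below is illustrative only: the helper name is invented, and the millisecond values are assumed stock settings for this release line, so substitute the values from your own cassandra.yaml and from your driver configuration (in the C/C++ driver the request timeout is set with `cass_cluster_set_request_timeout`).

```python
# Assumed server-side timeouts in milliseconds. Replace these with the
# values from your own cassandra.yaml; they are placeholders, not measured.
SERVER_TIMEOUTS_MS = {
    "read_request_timeout_in_ms": 5000,
    "range_request_timeout_in_ms": 10000,
    "write_request_timeout_in_ms": 2000,
}

def timeouts_shorter_than_server(driver_timeout_ms,
                                 server_timeouts_ms=SERVER_TIMEOUTS_MS):
    """Return the server timeouts that the driver timeout undercuts.

    For each name returned, the driver may give up on a request while the
    coordinator is still allowed to keep waiting on the replicas.
    """
    return sorted(
        name
        for name, server_ms in server_timeouts_ms.items()
        if driver_timeout_ms < server_ms
    )

# Hypothetical driver timeout of 3 seconds.
print(timeouts_shorter_than_server(3000))
```

An empty result means the driver waits at least as long as the coordinator does for every listed request type, which points the investigation toward the queries or the network rather than the timeout setting.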
Re: counter cache loading very slow
Sounds like you're potentially hitting a bug, maybe even one that hasn't been hit before. How are you determining it's counters that are the problem? Is it stalling on the "Initializing counters" log line or something?

raft.so - Cassandra consulting, support, and managed services

On Mon, Apr 26, 2021 at 3:25 AM Gil Ganz wrote:

> Hey
> I have a cluster, 3.11.6, startup is very slow, i3en.xlarge server with
> about 1tb of data, takes 45 minutes to startup, almost 40 minutes of that
> is loading the saved counter cache from disk (200mb), and I can see that in
> these 40 minutes the amount of data read from disk is very high, up to
> 700MB/s. Counters is a feature that is used heavily in this environment,
> including the main table in the db, which is half the data size.
>
> What can cause such a slow load of such a small cache? Is there something
> that can be done to make this quicker?
>
> Gil
>