io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer

2021-04-26 Thread velocix cephusers
Hi,

We have a 5-node Cassandra cluster running version 3.0.13.
We recently upgraded the Cassandra cpp driver on the application side
from cassandra-cpp-driver-2.6.0-1.el7.centos.x86_64
to cassandra-cpp-driver-2.15.3-1.el7.x86_64. Since the upgrade, the Cassandra
system.log has been continuously filling with the message below.

INFO  [SharedPool-Worker-11] 2021-04-26 07:11:18,445 Message.java:615 - Unexpected exception during request; channel = [id: 0x08a9bc0f, L:/10.50.11.123:9042 ! R:/10.50.11.182:44734]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
        at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]

These INFO-level entries cause system.log to be rotated 4 or 5 times a day,
but we see no functional impact.

What could be the problem here? Let me know if more details are needed.

Regards,
Renoy Paulose


Re: io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer

2021-04-26 Thread sunil pawar
Hi Renoy,

The message below shows there are connection hiccups between the remote and
local machines.
   Unexpected exception during request; channel = [id: 0x08a9bc0f, L:/10.50.11.123:9042 ! R:/10.50.11.182:44734]

L stands for the local machine address and R for the remote machine address.
Please check for connection issues or firewall restrictions between the two
machines.
Since you upgraded the driver recently, I suggest checking whether any
configuration is missing.
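
To see which client machines are involved, the R: address of each entry can
be tallied. The following is only a rough sketch: the function name and the
regular expression are made up for illustration and assume the channel
format shown in the log excerpt above.

```python
import re
from collections import Counter

# Matches the channel description in the log entry, for example
# "L:/10.50.11.123:9042 ! R:/10.50.11.182:44734".
CHANNEL_RE = re.compile(
    r"L:/\s*(?P<local>[\d.]+):(?P<lport>\d+)\s*!\s*"
    r"R:/\s*(?P<remote>[\d.]+):(?P<rport>\d+)"
)

ENTRY_MARKER = "Unexpected exception during request"


def count_resets_by_client(log_text):
    """Count 'Connection reset by peer' entries per remote (client) IP."""
    # Mail or terminal wrapping can split an entry across lines,
    # so collapse all whitespace before searching.
    flat = " ".join(log_text.split())
    counts = Counter()
    # Everything after each marker, up to the next marker, is one entry.
    for entry in flat.split(ENTRY_MARKER)[1:]:
        if "Connection reset by peer" not in entry:
            continue
        match = CHANNEL_RE.search(entry)
        if match:
            counts[match.group("remote")] += 1
    return counts
```

Calling count_resets_by_client(open("system.log").read()) and looking at the
largest counts would show whether the resets come from one application host
or from all of them (the file path here is a placeholder).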

Thanks,
Sunil Pawar


Re: io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer

2021-04-26 Thread Erick Ramirez
That message gets logged when the node tries to respond back to the client
but the driver has already given up waiting for the cluster to respond so
the connection is no longer active.

It typically happens when running an expensive query and the coordinator is
still waiting for the replicas to respond but the driver already reached
the client-side timeout. It can also happen when the driver has been
configured with a very low timeout value so the coordinator never gets a
chance to respond back.

Check for the timeouts configured on the driver. I'd also recommend
reviewing the app queries for clues. Cheers!
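
For anyone who wants to see the symptom in isolation, here is a minimal
sketch of how a read on the server side ends in "Connection reset by peer"
once the other end has abandoned the connection. It is not the driver's
actual behaviour: the function name is hypothetical, and SO_LINGER with a
zero timeout is used only to force the client to send a TCP reset on close.

```python
import socket
import struct
import time


def provoke_reset():
    """Return the error a server-side read sees after the client aborts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # Assumption for the demo: a zero linger timeout makes close() send
    # RST instead of the normal FIN handshake.
    client.setsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
    )
    client.close()
    time.sleep(0.2)  # give the RST time to arrive over loopback

    try:
        server_side.recv(1024)
        return None  # no reset observed
    except ConnectionResetError as exc:
        return exc
    finally:
        server_side.close()
        listener.close()
```

The server did nothing wrong here; it simply finds out about the abandoned
connection the next time it touches the socket, which matches the "no
functional impact" observation in the original report.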



Re: counter cache loading very slow

2021-04-26 Thread Kane Wilson
Sounds like you're potentially hitting a bug, maybe even one that hasn't
been hit before. How are you determining it's counters that are the
problem? Is it stalling on the Initializing counters log line or something?

raft.so - Cassandra consulting, support, and managed services


On Mon, Apr 26, 2021 at 3:25 AM Gil Ganz wrote:

> Hey
> I have a 3.11.6 cluster where startup is very slow. On an i3en.xlarge
> server with about 1 TB of data, startup takes 45 minutes, and almost 40
> minutes of that is spent loading the saved counter cache (200 MB) from
> disk. During those 40 minutes the disk read rate is very high, up to
> 700 MB/s. Counters are used heavily in this environment, including in the
> main table in the db, which is half the data size.
>
> What can cause such a slow load of such a small cache? Is there anything
> that can be done to make this quicker?
>
> Gil
>
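
As a rough sanity check on the figures quoted above, the arithmetic below
gives an upper bound on how much data could have been read during the cache
load. It assumes the 700 MB/s peak was sustained for the full 40 minutes,
which the report does not claim, so the real volume is likely lower. The
function and parameter names are made up for illustration.

```python
def read_volume_upper_bound(peak_mb_per_s, minutes, cache_mb):
    """Return (max MB read, ratio of that volume to the cache size)."""
    max_read_mb = peak_mb_per_s * minutes * 60
    return max_read_mb, max_read_mb / cache_mb


# Figures from the report: 700 MB/s peak, 40 minutes, 200 MB saved cache.
max_read_mb, ratio = read_volume_upper_bound(700, 40, 200)
print(f"at most {max_read_mb / 1_000_000:.2f} TB read, "
      f"{ratio:.0f}x the cache size")
```

Even a fraction of that bound is far more than the 200 MB cache file, which
suggests the time goes into reads triggered while loading the cache rather
than into reading the cache file itself.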