Hi,

While digging through the Cassandra logs, I saw this exception occur several times (see the trace below).
I was wondering what can cause such big messages, and whether this error could make a repair session fail. My repairs currently take a very long time, and sometimes I even stop them because they seem to be hanging and will never return. For information, I run repair one (node+table) at a time, sequentially, and I stop a repair once it has been stuck on the same (node+table) for 5 days. The repair session is still alive and the pid is still there; there is no repair failure message, but nothing happens and the repair % does not increase.

Is there something I can do to avoid that error? If it can be fixed by a configuration change alone, which setting should I increase? I guess it is commitlog_segment_size_in_mb, but I have not been able to find definite confirmation on the web.

Here is some information about the installation:

- Apache Cassandra 4.0 GA release
- Output of cqlsh: cqlsh 6.0.0 | Cassandra 4.0.1 | CQL spec 3.4.5 | Native protocol v5
- OS: CentOS Linux release 8.4.2105
- Output of nodetool status (the address column is hidden because it is sensitive). As you can see, the cluster is still imbalanced 2 weeks after I joined the last 2 nodes (the ones with the lowest load). Before that it was perfectly balanced with the same data model, in case this information helps:

--  Load        Tokens  Owns (effective)  Host ID                               Rack
UN  712.72 GiB  8       32.6%             00f8bb86-5283-4b01-9819-fe4d59337680  rack1
UN  830.91 GiB  8       35.5%             b9490ee5-44ba-4898-add7-159a7eeb06d9  rack1
UN  763.69 GiB  8       34.7%             931ccba9-9aef-4d79-9fa0-53e4654554f5  rack1
UN  720.29 GiB  8       33.0%             4b5445ad-fb3c-419e-98b9-80599014e2b4  rack1
UN  273.6 GiB   8       22.8%             34db97d3-6c6d-4544-b656-9d4ae2a82dca  rack1
UN  616.92 GiB  8       41.3%             631d7b77-de74-4fe3-86a2-8bd3beec191e  rack1

Thank you for your help!

Sébastien.

--
/var/log/cassandra/system.log:ERROR [AntiEntropyStage:1] 2021-11-05 00:48:28,074 CassandraDaemon.java:579 - Exception in thread Thread[AntiEntropyStage:1,5,main]
/var/log/cassandra/system.log-org.apache.cassandra.net.Message$OversizedMessageException: Message of size 140580605 bytes exceeds allowed maximum of 134217728 bytes
/var/log/cassandra/system.log-    at org.apache.cassandra.net.OutboundConnection.enqueue(OutboundConnection.java:328)
/var/log/cassandra/system.log-    at org.apache.cassandra.net.OutboundConnections.enqueue(OutboundConnections.java:84)
/var/log/cassandra/system.log-    at org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:338)
/var/log/cassandra/system.log-    at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
/var/log/cassandra/system.log-    at org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
/var/log/cassandra/system.log-    at org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
/var/log/cassandra/system.log-    at org.apache.cassandra.repair.Validator.respond(Validator.java:269)
/var/log/cassandra/system.log-    at org.apache.cassandra.repair.Validator.run(Validator.java:257)
/var/log/cassandra/system.log-    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
/var/log/cassandra/system.log-    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
/var/log/cassandra/system.log-    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
/var/log/cassandra/system.log-    at java.lang.Thread.run(Thread.java:748)
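
To put the two numbers from the exception message in perspective, here is a minimal Python sketch that extracts them and computes by how much the message exceeded the limit. The helper name oversize_report is hypothetical and only for illustration; the regular expression assumes the message wording shown in the trace above, and nothing here says which configuration setting controls the limit.

```python
import re

# Matches the size and limit figures in the OversizedMessageException text.
# Assumption: the wording is exactly as it appears in the trace above.
PATTERN = re.compile(
    r"Message of size (\d+) bytes exceeds allowed maximum of (\d+) bytes"
)


def oversize_report(line):
    """Return size, limit and excess (in bytes) for one log line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    size = int(match.group(1))
    limit = int(match.group(2))
    return {
        "size": size,
        "limit": limit,
        "excess": size - limit,          # bytes over the allowed maximum
        "limit_mib": limit / 2 ** 20,    # the limit expressed in MiB
    }


line = (
    "org.apache.cassandra.net.Message$OversizedMessageException: "
    "Message of size 140580605 bytes exceeds allowed maximum of 134217728 bytes"
)
report = oversize_report(line)
print(report)
```

For the line in the trace, the limit works out to exactly 128 MiB and the message is 6362877 bytes (roughly 6 MiB) over it.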