You can check the hinted handoff throttle rate on a node and adjust it if needed. nodetool gethintedhandoffthrottlekb shows the current value, and nodetool sethintedhandoffthrottlekb sets it.

I would also check disk stats for the volume where hints are stored, using either sar or iostat.

Sincerely,
Aakash Pandhi

On Wednesday, February 26, 2020, 04:36:16 PM CST, Laxmikant Upadhyay <laxmikant....@gmail.com> wrote:

Is dc1 a simple standby DC, or do you run some operations (e.g. compute for analysis) on it? Have you found the root cause of the OOM? Do you see any specific Cassandra operation (e.g. repair) causing the OOM?

One tip: try upgrading to 3.11.6, as lots of bugs have been fixed since 3.11.0.

On Wed, Feb 26, 2020, 9:53 PM Krish Donald <gotomyp...@gmail.com> wrote:

Nodes are going down due to Out of Memory. We are using a 31GB heap in DC1, whereas DC2 (which serves the traffic) has a 16GB heap. We had to increase the heap in DC1 because DC1 nodes were going down due to Out of Memory issues, but DC2 nodes never went down.

We also noticed messages like the following in system.log:

FailureDetector.java:288 - Not marking nodes down due to local pause of 9532654114 > 5000000000

On Tue, Feb 25, 2020 at 9:43 PM Erick Ramirez <erick.rami...@datastax.com> wrote:

What's the reason for nodes going down? Is it because the cluster is overloaded? Hints will get handed off periodically when nodes come back to life, but if they happen to go down again or become unresponsive (for whatever reason), the handoff will be delayed until the next cycle. I think it's every 5 minutes but don't quote me.

Hinted MV updates can be problematic so it is a symptom, but with limited info I'm not sure that it's the cause of the slow handoffs.

Cheers!
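
A note on the FailureDetector line quoted earlier in the thread: the two numbers are most likely nanoseconds (the observed local pause and the maximum allowed pause), which would make this a pause of roughly 9.5 seconds against a 5 second limit. Below is a minimal Python sketch for converting such lines when scanning system.log. The regex and the parse_local_pause helper are illustrative assumptions, not part of Cassandra or nodetool.

```python
import re

# Matches the FailureDetector message quoted in the thread. The two numbers
# are assumed to be nanoseconds: observed pause > allowed maximum.
PAUSE_RE = re.compile(r"local pause of (\d+) > (\d+)")


def parse_local_pause(line):
    """Return (pause_seconds, threshold_seconds), or None if the line does not match."""
    m = PAUSE_RE.search(line)
    if m is None:
        return None
    pause_ns, threshold_ns = (int(g) for g in m.groups())
    return pause_ns / 1e9, threshold_ns / 1e9


line = ("FailureDetector.java:288 - Not marking nodes down due to "
        "local pause of 9532654114 > 5000000000")
pause_s, threshold_s = parse_local_pause(line)
print(f"paused {pause_s:.2f}s (threshold {threshold_s:.0f}s)")
# prints: paused 9.53s (threshold 5s)
```

Pauses of this length on a node with a large heap are often worth comparing against the GC logs for the same timestamps, though the log line alone does not confirm the cause.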