Nodes are going down with Out of Memory errors. We are using a 31GB heap in DC1, whereas DC2 (which serves the traffic) has a 16GB heap. We had to increase the heap in DC1 because the DC1 nodes were going down with Out of Memory errors, while the DC2 nodes never went down.
We also noticed messages like the following in system.log:

FailureDetector.java:288 - Not marking nodes down due to local pause of 9532654114 > 5000000000

On Tue, Feb 25, 2020 at 9:43 PM Erick Ramirez <erick.rami...@datastax.com> wrote:

> What's the reason for nodes going down? Is it because the cluster is
> overloaded? Hints will get handed off periodically when nodes come back to
> life but if they happen to go down again or become unresponsive (for
> whatever reason), the handoff will be delayed until the next cycle. I think
> it's every 5 minutes but don't quote me.
>
> Hinted MV updates can be problematic so it is a symptom but with limited
> info, I'm not sure that it's the cause for slow handoffs. Cheers!
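
For what it's worth, here is a rough reading of the FailureDetector line from system.log above. It assumes both figures are nanoseconds, which is my understanding of how the failure detector logs them; if so, the node paused for about 9.5 seconds against a 5 second threshold. The helper name `parse_local_pause` is made up for illustration and is not part of any Cassandra tooling.

```python
import re

# The message copied from system.log above.
LOG_LINE = (
    "FailureDetector.java:288 - Not marking nodes down due to "
    "local pause of 9532654114 > 5000000000"
)


def parse_local_pause(line):
    """Return (pause_seconds, threshold_seconds), or None if the line does not match.

    Assumption: both numbers in the message are nanoseconds.
    """
    match = re.search(r"local pause of (\d+) > (\d+)", line)
    if match is None:
        return None
    pause_ns, threshold_ns = (int(group) for group in match.groups())
    return pause_ns / 1e9, threshold_ns / 1e9


pause_s, threshold_s = parse_local_pause(LOG_LINE)
print(f"pause {pause_s:.2f}s, threshold {threshold_s:.2f}s")
```

A pause of that length would be consistent with long GC pauses on the large heap, but that is only a guess until it is checked against the GC logs.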