Hi, all,

We are trying to reason about the possible causes of a C* (v1.x) cluster
connection that keeps flapping in production. It is a two-node cluster; each
node keeps marking the other DOWN and then back UP within seconds, multiple
times a day. We have checked the load on the cluster: it is very light, and GC
activity is low as well. We have also verified that the network interfaces and
devices on the nodes were working fine during the incidents. We are now
shifting our investigation to the network topology and settings, so we are
thinking of capturing gossip heartbeat packets to verify whether they are
received as expected on the other end.

Has anyone tried capturing the packets of gossip internode communication? What
filter or criteria would isolate heartbeat-related packets only?
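
For context, here is a rough sketch of the capture we have in mind as a
starting point. It assumes the default storage_port of 7000 (a different
value may be set in cassandra.yaml), and the peer address 10.0.0.2, the
interface eth0 and the output file name are all hypothetical placeholders. As
far as we understand, gossip shares this port with all other internode
messages, so this filter would capture all internode traffic, not heartbeats
only; the helper below just assembles the tcpdump command line.

```python
import shlex


def build_capture_command(peer_ip, interface="eth0", storage_port=7000,
                          outfile="internode.pcap"):
    """Build a tcpdump argv list for traffic to/from one peer node.

    Assumptions (not confirmed for our cluster): storage_port is the
    Cassandra default 7000; interface and outfile are placeholders.
    """
    # BPF expression: TCP traffic on the internode port, to or from the peer.
    bpf_filter = f"tcp port {storage_port} and host {peer_ip}"
    return [
        "tcpdump",
        "-i", interface,  # interface to listen on
        "-n",             # do not resolve addresses to names
        "-s", "0",        # capture full packets, no truncation
        "-w", outfile,    # write raw packets to a pcap file
        bpf_filter,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed and then run as root.
    print(shlex.join(build_capture_command("10.0.0.2")))
```

Running the same capture on both nodes at once and comparing the two pcap
files around a DOWN/UP event should show whether packets sent by one node
arrive at the other.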

Thanks in advance!


Michael
