Hi, all,
We are trying to reason about the possible causes of a C* (v1.x) cluster connection that keeps flapping in production. (It is a two-node cluster; each node keeps marking the other node DOWN, but the other node comes back UP within seconds. This happens multiple times a day.)

We have checked the load on the cluster: it is very light, and GC activity is low as well. We have also verified that the network interfaces and devices on the nodes were working fine during the incidents.

We are now shifting our investigation to the network topology and settings, so we are thinking of capturing gossip heartbeat packets to verify whether they are received as expected on the other end.

Has anyone tried to capture packets of gossip internode communication? What filter or criteria would isolate the heartbeat-related packets only?

Thanks in advance!
Michael