Most likely node A has some gossip-related problems. You can try purging the gossip state on node A, following this procedure: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_gossip_purge.html
Yabin

On Mon, Oct 3, 2016 at 2:38 AM, Girish Kamarthi <girish.kamar...@stellapps.com> wrote:

> Hi All,
>
> I want to test out a scenario where there is intermittent network issues
> on one of the node.
>
> I've got Cassandra 3.7 cluster of 3 nodes with the keyspace replication
> factor of 3.
>
> All the 3 nodes(node A, node B, node C) are started and are in sync. When
> one of the cassandra node went down (node A), I restarted cassandra, the
> node A gets in sync with the other nodes B & C.
>
> Now my question is when one of the node has issues like intermittent
> network issues (cassandra is still up and running). Say node A is having
> network issues, the nodetool status on the other 2 nodes b & C shows that
> the node A is down.
>
> *Debug.log of Node B & C:*
>
> DEBUG [GossipTasks:1] 2016-10-03 11:46:18,922 Gossiper.java:337 - Convicting /10.1.1.4 with status NORMAL - alive false
>
> When the network is back on the node A the nodetool status shows that the
> other nodes are down.
>
> *Debug.log of Node A:*
>
> DEBUG [GossipTasks:1] 2016-10-03 11:47:23,613 Gossiper.java:337 - Convicting /10.1.1.5 with status NORMAL - alive false
>
> DEBUG [GossipTasks:1] 2016-10-03 11:47:23,614 Gossiper.java:337 - Convicting /10.1.1.6 with status NORMAL - alive false
>
> Below are the configuration changes I made in the cassandra.yaml files.
>
> Node 01
>
> cluster_name: 'Test Cluster'
> num_tokens: 256
> seed_provider:
>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>       parameters:
>           - seeds: "10.1.1.4,10.1.1.5,10.1.1.6"
> listen_address: 10.1.1.4
> broadcast_address: 10.1.1.4
> rpc_address: 0.0.0.0
> broadcast_rpc_address: 10.1.1.4
>
> Node02
>
> cluster_name: 'Test Cluster'
> num_tokens: 256
> seed_provider:
>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>       parameters:
>           - seeds: "10.1.1.4,10.1.1.5,10.1.1.6"
> listen_address: 10.1.1.5
> broadcast_address: 10.1.1.5
> rpc_address: 0.0.0.0
> broadcast_rpc_address: 10.1.1.5
>
> Node03
>
> cluster_name: 'Test Cluster'
> num_tokens: 256
> seed_provider:
>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>       parameters:
>           - seeds: "10.1.1.4,10.1.1.5,10.1.1.6"
> listen_address: 10.1.1.6
> broadcast_address: 10.1.1.6
> rpc_address: 0.0.0.0
> broadcast_rpc_address: 10.1.1.6
>
> Nodetool status on node A when the network is up shows that the other
> nodes are down (DN).
>
> Nodetool status on the other nodes B & C shows that the node 1 is down (DN)
>
> How does the handshaking works in this scenario?
>
> Why the node A is not in sync with the other nodes when the network is up?
>
> Please give me some inputs on resolving this issue.
>
> Thanks & Regards,
> Girish Kumar Kamarthi
> +91-9986427891
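
To compare what each node believes about its peers, it may help to pull the "Convicting" entries out of each node's debug.log and line up the timestamps. The following is only a rough sketch: the regular expression is modelled on the three log lines quoted above, and the function name `convictions` and the sample list are made up for illustration, not part of Cassandra.

```python
import re
from collections import defaultdict

# Pattern modelled on the Gossiper lines quoted in this thread, e.g.
# "... 2016-10-03 11:46:18,922 Gossiper.java:337 - Convicting /10.1.1.4
#  with status NORMAL - alive false". Other Cassandra versions may log
# this differently (assumption).
CONVICTION = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"Convicting /(?P<peer>\S+) with status (?P<status>\S+) "
    r"- alive (?P<alive>true|false)"
)


def convictions(lines):
    """Return {peer_ip: [timestamps]} for every conviction line found."""
    found = defaultdict(list)
    for line in lines:
        match = CONVICTION.search(line)
        if match:
            found[match.group("peer")].append(match.group("ts"))
    return dict(found)


# The two entries from node A's debug.log, copied from the message above.
SAMPLE_NODE_A = [
    "DEBUG [GossipTasks:1] 2016-10-03 11:47:23,613 Gossiper.java:337 - "
    "Convicting /10.1.1.5 with status NORMAL - alive false",
    "DEBUG [GossipTasks:1] 2016-10-03 11:47:23,614 Gossiper.java:337 - "
    "Convicting /10.1.1.6 with status NORMAL - alive false",
]

for peer, stamps in convictions(SAMPLE_NODE_A).items():
    print(peer, stamps)
```

Since the function accepts any iterable of lines, an open file handle for a node's debug.log can be passed in directly (the location of that file depends on how Cassandra was installed). Running it on all three nodes shows who convicted whom and when, which is useful to check before and after purging the gossip state.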