[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947794#comment-17947794 ]
Michael Semb Wever commented on CASSANDRA-19580: ------------------------------------------------ Could CASSANDRA-20033 have fixed this on 5.0 ? > Unable to contact any seeds with node in hibernate status > --------------------------------------------------------- > > Key: CASSANDRA-19580 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19580 > Project: Apache Cassandra > Issue Type: Bug > Components: Cluster/Gossip > Reporter: Cameron Zemek > Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Time Spent: 0.5h > Remaining Estimate: 0h > > We have customer running into the error 'Unable to contact any seeds!' . I > have been able to reproduce this issue if I kill Cassandra as its joining > which will put the node into hibernate status. Once a node is in hibernate it > will no longer receive any SYN messages from other nodes during startup and > as it sends only itself as digest in outbound SYN messages it never receives > any states in any of the ACK replies. So once it gets to the check > `seenAnySeed` in it fails as the endpointStateMap is empty. > > A workaround is copying the system.peers table from other node but this is > less than ideal. I tested modifying maybeGossipToSeed as follows: > {code:java} > /* Possibly gossip to a seed for facilitating partition healing */ > private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod) > { > int size = seeds.size(); > if (size > 0) > { > if (size == 1 && > seeds.contains(FBUtilities.getBroadcastAddress())) > { > return; > } > if (liveEndpoints.size() == 0) > { > List<GossipDigest> gDigests = prod.payload.gDigests; > if (gDigests.size() == 1 && > gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress())) > { > gDigests = new ArrayList<GossipDigest>(); > GossipDigestSyn digestSynMessage = new > GossipDigestSyn(DatabaseDescriptor.getClusterName(), > > DatabaseDescriptor.getPartitionerName(), > > gDigests); > MessageOut<GossipDigestSyn> message = new > MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN, > > digestSynMessage, > > GossipDigestSyn.serializer); > sendGossip(message, seeds); > } > else > { > sendGossip(prod, seeds); > } > } > else > { > /* Gossip with the seed with some probability. */ > double probability = seeds.size() / (double) > (liveEndpoints.size() + unreachableEndpoints.size()); > double randDbl = random.nextDouble(); > if (randDbl <= probability) > sendGossip(prod, seeds); > } > } > } > {code} > Only problem is this is the same as SYN from shadow round. It does resolve > the issue however as then receive an ACK with all the states. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org