Hello,

I have been bootstrapping 4 new nodes into an existing production cluster, one node at a time. The first 2 completed without errors, but I ran into issues with the 3rd one. The 4th node has not been started yet.
When bootstrapping the third node, the data streaming sessions completed
without issue, but bootstrapping did not finish. The node is still stuck in
JOINING state roughly 19 hours after data streaming completed.
Other reports of this issue seem to be related either to network 
connectivity issues between nodes, or multiple nodes bootstrapping 
simultaneously. I haven't found any evidence of either of these
situations, and there are no errors or stack traces in the logs.
I'm looking for the safest way to proceed. I'm fine with removing
the hanging node altogether, but I'd like confirmation that doing so wouldn't
leave the cluster in a bad state, and to know which data points I should look at
to gauge the situation.
If removing the node and starting over is OK, is any other maintenance 
on the existing nodes recommended? I've read of people
scrubbing/rebuilding nodes after coming out of this situation, but I'm not sure if
that's necessary.
Please let me know if any additional info would be helpful.

Thanks!
--
Chris Hornung


