Yes, this is a nasty state. When we get caught in this situation we restart all 'down' nodes one at a time to see if some node takes over the leader state.
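To check after each restart whether a leader has appeared, one option is to poll the Collections API CLUSTERSTATUS action and look for a replica flagged as leader. The Python below is only a rough sketch under assumptions: the base URL and the collection and shard names are placeholders, and the response layout (cluster -> collections -> shards -> replicas, with "leader" set to "true" on the leader) is what CLUSTERSTATUS normally returns but may differ between Solr versions.

```python
import json
import urllib.parse
import urllib.request


def fetch_cluster_status(base_url, collection):
    """Fetch CLUSTERSTATUS for one collection as a dict.

    base_url is a placeholder such as "http://localhost:8983/solr".
    """
    query = urllib.parse.urlencode(
        {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    )
    url = f"{base_url}/admin/collections?{query}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def find_leader(cluster_status, collection, shard):
    """Return (replica_name, node_name) of the active leader, or None.

    Assumes the usual CLUSTERSTATUS layout; a replica counts as leader only
    if it is flagged as leader and its state is "active".
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    for name, replica in shards[shard]["replicas"].items():
        is_leader = str(replica.get("leader", "")).lower() == "true"
        if is_leader and replica.get("state") == "active":
            return name, replica.get("node_name")
    return None
```

After restarting a node you could call find_leader(fetch_cluster_status("http://localhost:8983/solr", "mycollection"), "mycollection", "shard1") every few seconds (names are hypothetical) and only move on to the next node when it still returns None after a reasonable wait.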
If not, we restart the 'active' nodes one at a time. This causes downtime until some node realizes it can and must become leader. In our experience that downtime has never lasted more than a minute or two.

Good luck!

On Thu, 23 Mar 2023 at 03:54, Walter Underwood <wun...@wunderwood.org> wrote:

> I have a shard with replication factor 2. One shard is state:active, the
> other is state:down. The active shard is not a leader. Using the
> FORCELEADER command to try and get it elected leader doesn’t fix it. We
> tried adding another replica, but it is also in state:down, maybe because
> there isn’t a leader to replicate from.
>
> Any ideas on how to unwedge this?
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)