[ https://issues.apache.org/jira/browse/SOLR-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17897713#comment-17897713 ]
Mark Robert Miller edited comment on SOLR-17557 at 11/12/24 7:37 PM:
---------------------------------------------------------------------

That’s a safe, straightforward optimization, but I think it’s also worth considering having the replica just fail its leader attempt if its LIR term is behind, rather than try to PeerSync. The upside is that you remove PeerSync from the equation (fewer moving parts), and you drop the case where PeerSync is attempted but does not succeed because the replica is too far behind. Instead, you just fast-track the election to a replica that is up to date.

was (Author: markrmiller):
That’s a safe, straightforward optimization, but I think it’s also worth considering having the replica just fail its leader attempt if it’s LIR term is being rather than try and peersync. The upside being, you remove peersync from the equation, less moving parts, and you drop the case where peersync is attempted, but not successful because it’s too far behind. Instead, you just fast track the election to a replica that is up to date.

> PeerSync should only be called when the ZkShardTerm is not the highest
> ----------------------------------------------------------------------
>
>                 Key: SOLR-17557
>                 URL: https://issues.apache.org/jira/browse/SOLR-17557
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>            Reporter: Houston Putman
>            Priority: Major
>
> Currently when a leader is elected for a shard, PeerSync is called after
> election to make sure that the new leader is not missing documents that other
> replicas have.
> With the "new" LeaderInitiatedRecovery (LIR) implementation based on
> ZkShardTerms, we now have a much better idea as to which replicas have all
> the documents that the old leader had. So if the newly elected leader has the
> highest ZkShardTerm (i.e. it was already in sync with the old leader before
> the leader election), then we shouldn't need to run PeerSync.
> For the break-glass scenario where the newly elected leader does *not* have
> the highest ZkShardTerm, then we will probably still want to run PeerSync,
> just to be safe, as there will probably be data loss and we want to minimize
> how much data that is.
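
For illustration only, the two behaviours discussed in this thread can be compared in a small Java sketch. It is not Solr code and does not call any Solr API: the class LeaderSyncSketch, the Action enum, the decide method, and the failFastWhenBehind flag are all hypothetical names, and shard terms are modelled as a plain map from replica name to term value. The assumption is that a candidate whose term equals the highest term was in sync with the old leader.

```java
import java.util.Map;

// Hypothetical sketch, not Solr's implementation. Terms are modelled as a
// simple replica-name -> term map instead of the real ZkShardTerms class.
final class LeaderSyncSketch {

    enum Action { SKIP_PEER_SYNC, RUN_PEER_SYNC, FAIL_ELECTION }

    // Highest term among all known replicas (0 if none are known).
    static long highestTerm(Map<String, Long> terms) {
        return terms.values().stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    // Decide what a leader candidate should do after winning the election.
    // failFastWhenBehind = false models the issue description (PeerSync only
    // when behind); true models the comment's suggestion (give up the attempt).
    static Action decide(Map<String, Long> terms, String candidate, boolean failFastWhenBehind) {
        long candidateTerm = terms.getOrDefault(candidate, 0L);
        if (candidateTerm >= highestTerm(terms)) {
            // Candidate already has the highest term: assumed to be in sync.
            return Action.SKIP_PEER_SYNC;
        }
        return failFastWhenBehind ? Action.FAIL_ELECTION : Action.RUN_PEER_SYNC;
    }

    public static void main(String[] args) {
        Map<String, Long> terms = Map.of("replica1", 5L, "replica2", 5L, "replica3", 3L);
        System.out.println(decide(terms, "replica1", false)); // up-to-date candidate
        System.out.println(decide(terms, "replica3", false)); // behind, issue's proposal
        System.out.println(decide(terms, "replica3", true));  // behind, comment's proposal
    }
}
```

In this sketch the only difference between the two proposals is the branch taken when the candidate is behind: either attempt PeerSync to limit data loss, or fail the attempt so the election can move on to a replica that is up to date.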