Megan Carey created SOLR-15386:
----------------------------------

             Summary: Internal DOWNNODE request will mark replicas down even if their host node is now live
                 Key: SOLR-15386
                 URL: https://issues.apache.org/jira/browse/SOLR-15386
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
    Affects Versions: 8.6
            Reporter: Megan Carey

When a node is shutting down, it calls into:
# [CoreContainer.shutdown()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L1026]
# [ZkController.preClose()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L612]
# [ZkController.publishNodeAsDown()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L2753]

This sends a request to the Overseer to mark all of the replicas DOWN for the soon-to-be-down node:
# [Overseer.processMessage()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L459]
# [NodeMutator.downNode()|https://github.com/apache/lucene-solr/blob/branch_8_8/solr/core/src/java/org/apache/solr/cloud/overseer/NodeMutator.java#L48]

The issue we encountered was as follows:
# The Solr node shuts down.
# A DOWNNODE message is enqueued for the Overseer.
# The Solr node comes back up (we run on K8s, so a new node is auto-started as soon as the old node is detected as down).
# The DOWNNODE message is dequeued for processing and marks all replicas DOWN for the node that is now live.

The only place where these replicas would later be marked ACTIVE again is after ShardLeaderElection, but we did not reach that case.

An easy fix is to add a check for node liveness prior to marking replicas down, but a lot of tests fail with this change. Was this the intended functionality?
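
To make the race and the proposed liveness check concrete, here is a minimal, self-contained sketch. It is a simplified model and not Solr code: {{DownNodeSketch}}, {{processDownNode}}, the {{checkLiveness}} flag, and the node and replica names are all hypothetical, and the real logic lives in {{NodeMutator.downNode()}} and operates on the cluster state. The sketch only shows how a stale DOWNNODE message behaves with and without the check. It does not address why existing tests fail with such a check, or whether the current behavior is intended.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the race. None of these types are Solr classes.
class DownNodeSketch {
    enum State { ACTIVE, DOWN }

    // replica name -> name of the node hosting it
    private final Map<String, String> replicaToNode = new LinkedHashMap<>();
    // replica name -> published state
    private final Map<String, State> replicaState = new LinkedHashMap<>();
    // stand-in for the set of live nodes tracked in ZooKeeper
    private final Set<String> liveNodes = new HashSet<>();

    void addReplica(String replica, String node) {
        replicaToNode.put(replica, node);
        replicaState.put(replica, State.ACTIVE);
    }

    void setLive(String node, boolean live) {
        if (live) {
            liveNodes.add(node);
        } else {
            liveNodes.remove(node);
        }
    }

    State stateOf(String replica) {
        return replicaState.get(replica);
    }

    // Processes a dequeued DOWNNODE message for the given node and returns
    // the replicas that were marked DOWN. With checkLiveness set, a message
    // for a node that is live again is treated as stale and skipped.
    List<String> processDownNode(String node, boolean checkLiveness) {
        List<String> marked = new ArrayList<>();
        if (checkLiveness && liveNodes.contains(node)) {
            return marked; // stale message: the node has already come back
        }
        for (Map.Entry<String, String> e : replicaToNode.entrySet()) {
            if (e.getValue().equals(node)) {
                replicaState.put(e.getKey(), State.DOWN);
                marked.add(e.getKey());
            }
        }
        return marked;
    }

    // Replays the reported sequence: shutdown, enqueue, restart, dequeue.
    static DownNodeSketch restartedCluster() {
        DownNodeSketch cluster = new DownNodeSketch();
        cluster.addReplica("core_node1", "nodeA");
        cluster.addReplica("core_node2", "nodeA");
        cluster.addReplica("core_node3", "nodeB");
        cluster.setLive("nodeB", true);
        cluster.setLive("nodeA", false); // node shuts down, DOWNNODE is enqueued
        cluster.setLive("nodeA", true);  // replacement node starts before dequeue
        return cluster;
    }

    public static void main(String[] args) {
        System.out.println("without check: "
                + restartedCluster().processDownNode("nodeA", false));
        System.out.println("with check:    "
                + restartedCluster().processDownNode("nodeA", true));
    }
}
```

In this model, the unchecked path marks both replicas on the restarted node DOWN, while the checked path leaves them untouched. A check like this also skips a DOWNNODE message for a node that is still in the middle of shutting down if it is still listed as live at that moment, which is one possible (unconfirmed) reason a naive check could change behavior that tests rely on.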