HoustonPutman opened a new pull request, #3205: URL: https://github.com/apache/solr/pull/3205
`TestCoordinatorRole.testNRTRestart()` has been flaky for a long time, and for various reasons. I believe this is the last one.

Basically, the last part of the test is supposed to alternately shut down the NRT replica and the PULL replica, and keep sending requests that have `shards.preference=NRT`, until a PULL replica is forced to serve the request. The issue was that the pull node is started right before this. So if a very low (< 300 ms) random value is chosen for `serveTogetherTime`, the PULL replica will fail to recover from the NRT replica leader. Pull replicas do not become active unless they recover on startup. So when the NRT replica is offline, requests will fail because there are no replicas serving the requested shard. The `getHostCoreName` call can tolerate up to 500 ms of failures, so if `downTime` (another random int) is greater than 500, or somewhat lower because the node also takes time to start up, the number of allowed errors will be exceeded.

The fix here has 2 parts:

- The only really necessary fix is waiting for the pull replica to come online before starting to take down other nodes.
- In order to keep the spirit of the test, I reversed the ordering of the Jettys that will be brought down, because otherwise the PULL replica is chosen immediately and we don't have to do any iterations of this loop. Because of this, we need to do the same replica-state check at the end of the loop, to ensure the above failure scenario doesn't happen during our loop either.
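The first part of the fix, waiting for the pull replica to come online before taking other nodes down, amounts to polling a replica-state condition with a timeout. The following is a minimal, self-contained sketch of that idea only. It does not use Solr's test framework or cluster-state APIs: `ReplicaWaitSketch`, `State`, and `waitFor` are hypothetical names introduced for illustration, and the replica state is simulated with an `AtomicReference` instead of being read from the cluster.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

public class ReplicaWaitSketch {

  // Hypothetical stand-in for a replica's published state (not Solr's own type).
  public enum State { DOWN, RECOVERING, ACTIVE }

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out before the condition became true
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Simulated pull replica that has just been started and is still recovering.
    AtomicReference<State> pullReplica = new AtomicReference<>(State.RECOVERING);

    // Simulated recovery that completes after a short delay.
    Thread recovery = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      pullReplica.set(State.ACTIVE);
    });
    recovery.start();

    // Only proceed to shutting down other nodes once the pull replica is ACTIVE.
    boolean active = waitFor(() -> pullReplica.get() == State.ACTIVE, 5_000, 20);
    System.out.println("pull replica active before node shutdown: " + active);
  }
}
```

In the scenario described above, the same kind of check would run once before the loop starts and again at the end of each iteration, so that no node is taken down while the pull replica is still recovering.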