[ https://issues.apache.org/jira/browse/SOLR-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628565#comment-17628565 ]
Patson Luk commented on SOLR-16414:
-----------------------------------

Another minor thought: perhaps we should always check `isClosed()` in `ZkCmdExecutor#retryOperation`, so that it does not sleep for a minimum of 1.5 sec when the connection is already closed. It's not major; perhaps the check was there in case `isClosed()` is expensive to call?

https://github.com/apache/solr/blob/main/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkCmdExecutor.java#L69

> Race condition in PRS state updates
> -----------------------------------
>
>                 Key: SOLR-16414
>                 URL: https://issues.apache.org/jira/browse/SOLR-16414
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>             Fix For: 9.1
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> For PRS collections the individual states are potentially updated from
> individual nodes and sometimes from the overseer too. It's possible that
>
> # OP1 is sent to the overseer at T1
> # OP2 is executed in the node itself at T2
>
> Because we cannot guarantee that OP1, sent to the overseer, executes before
> OP2, the final state may be the result of OP1, which is incorrect and can
> lead to errors.
> The solution is to never do any PRS writes from the overseer.
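
For illustration, the suggestion in the comment about `retryOperation` could look roughly like the sketch below. This is not the Solr source: `RetrySketch`, both exception classes, and the `isClosed` supplier are hypothetical stand-ins, and the linear back-off `(i + 1) * retryDelayMs` is an assumption chosen so that a 1500 ms delay gives the 1.5 sec minimum sleep mentioned above. The point is only that the closed check runs on every attempt and again before sleeping.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class RetrySketch {
  /** Hypothetical stand-in for a connection-loss error; not the ZooKeeper class. */
  public static class ConnectionLossException extends Exception {}

  /** Hypothetical stand-in for an "already closed" failure. */
  public static class AlreadyClosedException extends RuntimeException {}

  /**
   * Runs op, retrying on connection loss. The closed check is performed before
   * every attempt (including the first) and again before sleeping, so a closed
   * connection is never waited on.
   */
  public static <T> T retryOperation(
      Callable<T> op, BooleanSupplier isClosed, int retryCount, long retryDelayMs)
      throws Exception {
    if (retryCount <= 0) {
      throw new IllegalArgumentException("retryCount must be positive");
    }
    ConnectionLossException first = null;
    for (int i = 0; i < retryCount; i++) {
      if (isClosed.getAsBoolean()) {
        throw new AlreadyClosedException();
      }
      try {
        return op.call();
      } catch (ConnectionLossException e) {
        if (first == null) {
          first = e; // remember the first failure to rethrow at the end
        }
        // Re-check before sleeping: no point waiting on a closed connection.
        if (isClosed.getAsBoolean()) {
          throw new AlreadyClosedException();
        }
        if (i != retryCount - 1) {
          Thread.sleep((i + 1) * retryDelayMs); // assumed linear back-off
        }
      }
    }
    throw first;
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger attempts = new AtomicInteger();
    try {
      // The operation always loses the connection, and the (simulated)
      // connection reports closed as soon as one attempt has been made.
      retryOperation(
          () -> {
            attempts.incrementAndGet();
            throw new ConnectionLossException();
          },
          () -> attempts.get() >= 1,
          5,
          1500L);
    } catch (AlreadyClosedException e) {
      System.out.println("closed, gave up after " + attempts.get() + " attempt(s)");
    }
  }
}
```

In this sketch the simulated operation fails once, the connection then reports closed, and the call returns immediately with `AlreadyClosedException` instead of sleeping. Whether the extra `isClosed()` call is cheap enough in the real client is the open question raised in the comment.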