stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1496270123
@dsmiley thanks for taking a look. I think we are looking at the same problem area but at different points in the lifecycle. Your summary on #1484 applies here as well:

```
A shard being split (a so-called parent shard) or that which recently completed (thus may have state INACTIVE) receives docs from a client (the test) and forwards to the sub-shards.
```

The difference is that you are seeing `ClusterState says we are the leader`, while one of my errors is `Request says it is coming from parent shard leader but we are in active state`. I think you are hitting a race window before the shard split succeeds, while I am hitting one after it succeeds.

I wonder whether my fix might cover your case as well. I am using `isSubShardLeader` to decide whether we can apply the leader behavior locally instead of the follower behavior. It seems to work well for the failures I am looking at.

It would be really nice to have a mapping of all possible state transitions, plus a unit-test-like verification that checks every transition is handled and that no illegal state can sneak in (like persisting a null-version document).
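
To make the state-transition idea a bit more concrete, here is a minimal Python sketch of what such an exhaustive check could look like. It is not Solr code: the state names mirror `Slice.State`, but `decide`, `always_follower`, `LEADER_OK` and the rules inside them are hypothetical stand-ins for illustration, and the model only covers requests flagged as coming from the parent shard leader.

```python
from enum import Enum
from itertools import product


class SliceState(Enum):
    # Names mirror Solr's Slice.State; everything else here is a toy model.
    ACTIVE = "active"
    INACTIVE = "inactive"
    CONSTRUCTION = "construction"
    RECOVERY = "recovery"
    RECOVERY_FAILED = "recovery_failed"


class Action(Enum):
    APPLY_AS_LEADER = "leader"      # assigns a version if the doc has none
    APPLY_AS_FOLLOWER = "follower"  # persists the doc with whatever version it carries
    REJECT = "reject"               # persists nothing


# Assumption: states in which a sub-shard leader may take the leader path.
LEADER_OK = {SliceState.CONSTRUCTION, SliceState.RECOVERY, SliceState.ACTIVE}


def decide(state, is_sub_shard_leader, has_version):
    """Hypothetical rule: prefer the leader path when we are a sub-shard leader."""
    if is_sub_shard_leader and state in LEADER_OK:
        return Action.APPLY_AS_LEADER
    if has_version:
        return Action.APPLY_AS_FOLLOWER
    return Action.REJECT


def always_follower(state, is_sub_shard_leader, has_version):
    """Deliberately naive rule, used to show that the checker catches bad cases."""
    return Action.APPLY_AS_FOLLOWER


def persists_null_version(action, has_version):
    # Only the follower path can persist a document that carries no version.
    return action is Action.APPLY_AS_FOLLOWER and not has_version


def find_violations(rule):
    """Enumerate every (state, leader flag, version flag) combination."""
    cases = product(SliceState, (True, False), (True, False))
    return [
        (state, leader, versioned)
        for state, leader, versioned in cases
        if persists_null_version(rule(state, leader, versioned), versioned)
    ]
```

Calling `find_violations(decide)` returns an empty list for this toy rule, while `find_violations(always_follower)` lists every combination that would persist an unversioned document. A real version would need the actual transitions and request flags from the update processor, which I have not tried to reproduce here.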