[ https://issues.apache.org/jira/browse/SOLR-17652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923822#comment-17923822 ]
Chris M. Hostetter commented on SOLR-17652:
-------------------------------------------

FYI, here's a way to demonstrate the bug using Solr's example mode (the commands below are from 9x and need to be modified slightly to work on main)...

{noformat}
./solr/packaging/build/dev/bin/solr start -e cloud -noprompt

# Set up our collection and both types of replicas
curl -sS 'http://localhost:8983/solr/admin/collections?action=CREATE&name=techproducts&numShards=1&tlogReplicas=1&createNodeSet=localhost:7574_solr'
curl -sS 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=techproducts&shard=shard1&node=localhost:8983_solr&type=PULL'
./solr/packaging/build/dev/bin/post -c techproducts ./solr/packaging/build/dev/example/exampledocs/*.xml

# Shut down pod hosting TLOG leader
./solr/packaging/build/dev/bin/solr stop -p 7574

# Stop & re-start pod hosting PULL replica (and embedded zk)
./solr/packaging/build/dev/bin/solr stop -p 8983
./solr/packaging/build/dev/bin/solr start --cloud -p 8983 --solr-home "/home/hossman/lucene/solr/solr/packaging/build/dev/example/cloud/node1/solr" --server-dir "/home/hossman/lucene/solr/solr/packaging/build/dev/server"

# Wait ~13 min (until you see an exception like the one above in the logs)

# Bring back the TLOG leader pod...
./solr/packaging/build/dev/bin/solr start --cloud -p 7574 --solr-home "/home/hossman/lucene/solr/solr/packaging/build/dev/example/cloud/node2/solr" --server-dir "/home/hossman/lucene/solr/solr/packaging/build/dev/server" -z 127.0.0.1:9983

# PULL replica will still stay DOWN forever
{noformat}

> PULL replicas can be stuck permanently in DOWN state if leader election takes too long
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-17652
>                 URL: https://issues.apache.org/jira/browse/SOLR-17652
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-17652.patch
>
>
> A bug exists in {{ZkController}} that can cause PULL replicas to be
> permanently stuck in a DOWN state (such that even a core RELOAD cannot fix
> it) if that PULL replica was initially loaded during a leader election that
> takes a significant amount of time.
>
> Details to follow in comments

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org