[ 
https://issues.apache.org/jira/browse/SOLR-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046442#comment-18046442
 ] 

Jan Høydahl commented on SOLR-13237:
------------------------------------

I hardened the test a bit in SOLR-18025 and 
[https://github.com/apache/solr/pull/3939] by retrying so that the tragic 
event is triggered reliably.

But there are still class-level failures during test teardown: 
"java.lang.Exception: Error shutting down MiniSolrCloudCluster", caused by some 
Jetty instances not shutting down within 30s. This is likely due to some thread 
being stuck waiting for Zk; [~dsmiley] can comment more. Should we somehow 
instrument the test or the JettySolrRunner to do extra logging or trigger a 
thread dump after waiting 30s for Jetty to stop, instead of just doing a kill? 
That would give us more insight into the situations in which we do not shut 
down and perhaps do not give up leadership.
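
To make the thread-dump idea concrete, here is a rough sketch of what I have 
in mind. It is not based on the actual JettySolrRunner code: the class 
StopWithThreadDump, the method stopOrDump and the "stop-watcher" thread name 
are made up for illustration, and the Runnable is only a stand-in for whatever 
call stops Jetty. The only real API used is the JDK's ThreadMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Solr: wait for a stop action and capture a
// thread dump if it does not finish in time.
public class StopWithThreadDump {

  /** Formats the full stack of every live thread (ThreadInfo.toString() truncates frames). */
  public static String threadDump() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    ThreadInfo[] infos = bean.dumpAllThreads(
        bean.isObjectMonitorUsageSupported(), bean.isSynchronizerUsageSupported());
    StringBuilder sb = new StringBuilder();
    for (ThreadInfo info : infos) {
      sb.append('"').append(info.getThreadName()).append("\" ")
          .append(info.getThreadState());
      if (info.getLockName() != null) {
        sb.append(" waiting on ").append(info.getLockName());
      }
      sb.append('\n');
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append('\n');
      }
    }
    return sb.toString();
  }

  /**
   * Runs stopAction on a separate thread. Returns null if it completes within
   * the timeout; otherwise returns a thread dump taken before the stuck thread
   * is interrupted, so the dump still shows where it was blocked.
   */
  public static String stopOrDump(Runnable stopAction, long timeout, TimeUnit unit)
      throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "stop-watcher");
      t.setDaemon(true);
      return t;
    });
    try {
      Future<?> done = executor.submit(stopAction);
      try {
        done.get(timeout, unit);
        return null;
      } catch (TimeoutException e) {
        return threadDump();
      }
    } finally {
      executor.shutdownNow(); // interrupts the stop thread if it is still running
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a Jetty stop call that hangs for 2s, with a 200ms timeout.
    String dump = stopOrDump(() -> {
      try {
        Thread.sleep(2_000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, 200, TimeUnit.MILLISECONDS);
    System.out.println(dump == null ? "stopped in time" : dump);
  }
}
```

In the test we would log the returned dump next to the existing "Error 
shutting down MiniSolrCloudCluster" exception, and only then fall back to the 
kill.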

> Not all types of index corruption guarantee a leader will "give up its 
> leadership"
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-13237
>                 URL: https://issues.apache.org/jira/browse/SOLR-13237
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-13237_logging.patch, log-fail-5D803D4699663918.txt, 
> log-fail-DEADBEEF.txt, log-pass-BEEFBEEF.txt, log-pass-FEEDBEEF.txt
>
>
> While investigating failures from LeaderTragicEventTest, I've found some 
> reproducible situations where (externally introduced) index corruption can 
> cause a leader to reject updates, but not automatically give up its 
> leadership.
> See discussion in LUCENE-8692 – notably simon's comment on why/how some 
> things are explicitly not treated as tragic today for a discussion of the 
> root cause.
> *We may need/want to rethink & improve the situations where a leader gives up 
> leadership, above and beyond IW registering a tragic exception*
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
