[ 
https://issues.apache.org/jira/browse/SOLR-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537479#comment-16537479
 ] 

Steve Rowe commented on SOLR-12412:
-----------------------------------

Policeman Jenkins found a reproducing seed 
[https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-MacOSX/734/] for test failures 
that {{git bisect}} blames on commit {{fddf35c}} on this issue:

{noformat}
Checking out Revision 80eb5da7393dd25c8cb566194eb9158de212bfb2 (refs/remotes/origin/branch_7x)
[...]
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestPullReplica -Dtests.method=testKillLeader -Dtests.seed=89003455250E12D2 -Dtests.slow=true -Dtests.locale=lg -Dtests.timezone=America/Rainy_River -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] FAILURE 60.4s J1 | TestPullReplica.testKillLeader <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: Replica core_node4 not up to date after 10 seconds expected:<1> but was:<0>
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([89003455250E12D2:C016C0E147B58684]:0)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.waitForNumDocsInAllReplicas(TestPullReplica.java:542)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.doTestNoLeader(TestPullReplica.java:490)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.testKillLeader(TestPullReplica.java:309)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]    >        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]    >        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:844)
[...]
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestPullReplica -Dtests.method=testRemoveAllWriterReplicas -Dtests.seed=89003455250E12D2 -Dtests.slow=true -Dtests.locale=lg -Dtests.timezone=America/Rainy_River -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] FAILURE 24.6s J1 | TestPullReplica.testRemoveAllWriterReplicas <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: Replica core_node4 not up to date after 10 seconds expected:<1> but was:<0>
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([89003455250E12D2:1A0EA86E31F0FB7B]:0)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.waitForNumDocsInAllReplicas(TestPullReplica.java:542)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.doTestNoLeader(TestPullReplica.java:490)
   [junit4]    >        at org.apache.solr.cloud.TestPullReplica.testRemoveAllWriterReplicas(TestPullReplica.java:303)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]    >        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]    >        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:844)
[...]
   [junit4]   2> NOTE: test params are: codec=HighCompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=8218, maxDocsPerChunk=6, blockSize=10), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=8218, blockSize=10)), sim=RandomSimilarity(queryNorm=true): {}, locale=lg, timezone=America/Rainy_River
   [junit4]   2> NOTE: Mac OS X 10.11.6 x86_64/Oracle Corporation 9 (64-bit)/cpus=3,threads=1,free=262884464,total=536870912
{noformat}

> Leader should give up leadership when IndexWriter.tragedy occurs
> ----------------------------------------------------------------
>
>                 Key: SOLR-12412
>                 URL: https://issues.apache.org/jira/browse/SOLR-12412
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>         Attachments: SOLR-12412.patch, SOLR-12412.patch
>
>
> When a leader hits an unrecoverable exception (e.g., 
> CorruptedIndexException), the shard goes into a read-only state and a human 
> has to intervene. In that case, it would be best if the leader gave up its 
> leadership and let another replica become the leader.



