[
https://issues.apache.org/jira/browse/SOLR-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545867#comment-16545867
]
Varun Thacker commented on SOLR-12412:
--------------------------------------
This test class has two methods:
* test()
* testOtherReplicasAreNotActive()
Both try to create a collection named "collection1". We should probably put the
collection delete in a finally block. That would avoid the following error:
{code:java}
[junit4] 2> 13586 INFO
(TEST-LeaderTragicEventTest.test-seed#[7146D51E1F1D9F1A]) [ ]
o.a.s.SolrTestCaseJ4 ###Starting test
[junit4] 2> 13588 INFO (qtp1687913357-34) [n:127.0.0.1:36827_solr ]
o.a.s.h.a.CollectionsHandler Invoked Collection Action :create with params
collection.configName=config&name=collection1&nrtReplicas=2&action=CREATE&numShards=1&wt=javabin&version=2
and sendToOCPQueue=true
[junit4] 2> 13590 INFO (OverseerThreadFactory-38-thread-1) [ ]
o.a.s.c.a.c.CreateCollectionCmd Create collection collection1
[junit4] 2> 13591 ERROR (OverseerThreadFactory-38-thread-1) [ ]
o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: collection1 operation:
create failed:org.apache.solr.common.SolrException: collection already exists:
collection1
[junit4] 2> at
org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:106)
[junit4] 2> at
org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:255)
[junit4] 2> at
org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:469)
[junit4] 2> at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
[junit4] 2> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[junit4] 2> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[junit4] 2> at java.lang.Thread.run(Thread.java:748){code}
Since testOtherReplicasAreNotActive() failed with an error, it never deleted
collection1. test() ran after it and hit the error above. test() still passed
even though the create collection call failed (which means there was already
a corrupted index). Sounds fishy?
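A minimal sketch of the finally-block cleanup suggested above. It does not use Solr: FakeCluster and CleanupSketch are hypothetical in-memory stand-ins, and the only point is that the delete runs even when the test body throws.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a cluster; this is not a Solr API.
class FakeCluster {
    private final Set<String> collections = new HashSet<>();

    void createCollection(String name) {
        // Mirrors the "collection already exists" failure seen in the log.
        if (!collections.add(name)) {
            throw new IllegalStateException("collection already exists: " + name);
        }
    }

    void deleteCollection(String name) {
        collections.remove(name);
    }

    boolean exists(String name) {
        return collections.contains(name);
    }
}

class CleanupSketch {
    // Creates the collection, runs the test body, and always deletes the collection.
    static void withCollection(FakeCluster cluster, String name, Runnable body) {
        cluster.createCollection(name);
        try {
            body.run();
        } finally {
            // Runs even when body throws, so a later test can reuse the name.
            cluster.deleteCollection(name);
        }
    }
}
```

With this shape, a failure inside one test method would no longer leave "collection1" behind for the next one.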
Could we replace the following line?
{code:java}
- int numReplicas = random().nextInt(2) + 1;
+ int numReplicas = TestUtil.nextInt(random(), 1, 2);{code}
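Both forms pick 1 or 2; the second just states the inclusive range directly. A small sketch to show the equivalence, where nextIntInclusive is a hypothetical helper written here for illustration (it is not the TestUtil implementation) and is only meant for small ranges:

```java
import java.util.Random;

class RangeSketch {
    // Hypothetical helper: returns a value in [min, max], both bounds inclusive.
    // Assumes max - min + 1 does not overflow, which holds for small ranges.
    static int nextIntInclusive(Random random, int min, int max) {
        return min + random.nextInt(max - min + 1);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            int a = random.nextInt(2) + 1;          // original form: 1 or 2
            int b = nextIntInclusive(random, 1, 2); // explicit-range form: 1 or 2
            System.out.println(a + " " + b);
        }
    }
}
```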
testOtherReplicasAreNotActive() -> When there are two replicas, where do we
actually check whether the other replica becomes active after it has been
started again? That is, after this statement, should we check whether it
becomes active and fail the test?
{code:java}
if (otherReplicaJetty != null) {
  // won't be able to do anything here, since this replica can't recover from the leader
  otherReplicaJetty.start();
}{code}
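If a check is wanted after the restart, one option is to poll the replica state with a timeout and assert on the result, whichever outcome the test expects. A generic sketch, assuming the state can be read through some boolean condition; WaitSketch and waitFor are hypothetical names, not part of the Solr test framework:

```java
import java.util.function.BooleanSupplier;

class WaitSketch {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held in time, false otherwise.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The test would pass in a condition that reads the replica's state and then assert that the return value matches the expected behaviour.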
testOtherReplicasAreNotActive() -> When the test selects one replica, what
are we testing exactly? From what I can understand, we are corrupting the
leader of a single-shard collection and then validating that it is still the
leader?
I'm trying to understand the corruptLeader() method: why are we trying to
delete segment files after every add? What if we just add the 100 docs and
then delete the segments_N file?
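For the second variant, only the final deletion step is sketched here, using plain java.nio against a directory path. SegmentsSketch and deleteSegmentsFiles are hypothetical names, and the sketch assumes the commit files in the index directory are the ones whose names start with "segments_"; how the test locates the leader's index directory is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class SegmentsSketch {
    // Deletes every file named segments_* in indexDir and returns how many were removed.
    static int deleteSegmentsFiles(Path indexDir) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, "segments_*")) {
            // Collect first so the directory is not modified while it is being listed.
            for (Path file : stream) {
                matches.add(file);
            }
            for (Path file : matches) {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches.size();
    }
}
```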
Happy to pitch in; I just wanted to understand the test better before diving in.
> Leader should give up leadership when IndexWriter.tragedy occur
> ---------------------------------------------------------------
>
> Key: SOLR-12412
> URL: https://issues.apache.org/jira/browse/SOLR-12412
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Attachments: SOLR-12412.patch, SOLR-12412.patch,
> jenkins-failure-2325.log
>
>
> When a leader hits some kind of unrecoverable exception (i.e.
> CorruptedIndexException), the shard goes into the readable state and a human
> has to intervene. In that case, it would be best if the leader gave up
> its leadership and let other replicas become the leader.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)