[ 
https://issues.apache.org/jira/browse/SOLR-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-6640:
--------------------------------
    Attachment: SOLR-6640.patch

This test passes with the patch: {{ant test 
-Dtestcase=ChaosMonkeySafeLeaderTest -Dtests.method=testDistribSearch 
-Dtests.seed=EDA082CD42EB33E3 -Dtests.slow=true -Dtests.locale=hi_IN 
-Dtests.timezone=Asia/Kuching -Dtests.asserts=true 
-Dtests.file.encoding=ISO-8859-1}} 

The patch does something very simple: before we begin to download segment 
files, it checks the index directory against the current commit point to find 
which files are extra, and removes them.
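
A minimal sketch of that cleanup step, written against plain java.nio rather than the Lucene Directory API so that it runs on its own. This is not the code from SOLR-6640.patch: the class name ExtraIndexFileCleaner, the method removeExtraFiles, and the idea of passing the commit point's files as a Set of names are all assumptions made for illustration. In the real patch the listing comes from indexDir.listAll() (as the log shows), and the commit's file names would presumably come from something like IndexCommit.getFileNames().

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not taken from SOLR-6640.patch. */
final class ExtraIndexFileCleaner {

    private ExtraIndexFileCleaner() {}

    /**
     * Deletes every regular file in indexDir whose name is not listed in
     * commitFiles (the files referenced by the current commit point) and
     * returns the names of the deleted files in sorted order.
     */
    static List<String> removeExtraFiles(Path indexDir, Set<String> commitFiles)
            throws IOException {
        List<String> extras = new ArrayList<>();

        // First pass: collect the extra names, so nothing is deleted
        // while the directory stream is still open.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indexDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isRegularFile(entry) && !commitFiles.contains(name)) {
                    extras.add(name);
                }
            }
        }

        // Second pass: remove the extras before any new segment files arrive.
        for (String name : extras) {
            Files.delete(indexDir.resolve(name));
        }

        Collections.sort(extras);
        return extras;
    }
}
```

Called on a directory that holds the "pre remove" files from the log, with a commit set containing only segments_1, this sketch would leave just segments_1 behind.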

For example -
{code}
-001/jetty3/index.20141216162501726
   [junit4]   2> 22194 T113 C7 P59775 oash.SnapPuller.fetchLatestIndex 
SOLR-6640:: indexDir.listAll() pre remove _0.cfe _0.cfs _0.si _0_1.liv _1.fdt 
_1.fdx segments_1 
   [junit4]   2> 22195 T79 C6 P59766 oasup.LogUpdateProcessor.finish 
[collection1] webapp=/_ path=/update params={wt=javabin&version=2} {add=[0-19 
(1487664160941539328)]} 0 1
   [junit4]   2> 22196 T113 C7 P59775 oash.SnapPuller.fetchLatestIndex 
SOLR-6640:: indexDir.listAll() post remove segments_1 
{code}

So these are the files that were not getting removed when we do IW.rollback() 
and that were causing the problem: 
{{_0.cfe _0.cfs _0.si _0_1.liv _1.fdt _1.fdx}}

I have yet to figure out whether IW.rollback() should have removed these 
files. 

> ChaosMonkeySafeLeaderTest failure with CorruptIndexException
> ------------------------------------------------------------
>
>                 Key: SOLR-6640
>                 URL: https://issues.apache.org/jira/browse/SOLR-6640
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>    Affects Versions: 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 5.0
>
>         Attachments: Lucene-Solr-5.x-Linux-64bit-jdk1.8.0_20-Build-11333.txt, 
> SOLR-6640.patch, SOLR-6640.patch
>
>
> Test failure found on jenkins:
> http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11333/
> {code}
> 1 tests failed.
> REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch
> Error Message:
> shard2 is not consistent.  Got 62 from 
> http://127.0.0.1:57436/collection1lastClient and got 24 from 
> http://127.0.0.1:53065/collection1
> Stack Trace:
> java.lang.AssertionError: shard2 is not consistent.  Got 62 from 
> http://127.0.0.1:57436/collection1lastClient and got 24 from 
> http://127.0.0.1:53065/collection1
>         at 
> __randomizedtesting.SeedInfo.seed([F4B371D421E391CD:7555FFCC56BCF1F1]:0)
>         at org.junit.Assert.fail(Assert.java:93)
>         at 
> org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1255)
>         at 
> org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1234)
>         at 
> org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:162)
>         at 
> org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
> {code}
> Cause of inconsistency is:
> {code}
> Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, 
> expected segment id=yhq3vokoe1den2av9jbd3yp8, got=yhq3vokoe1den2av9jbd3yp7 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/solr/build/solr-core/test/J0/temp/solr.cloud.ChaosMonkeySafeLeaderTest-F4B371D421E391CD-001/tempDir-001/jetty3/index/_1_2.liv")))
>    [junit4]   2>              at 
> org.apache.lucene.codecs.CodecUtil.checkSegmentHeader(CodecUtil.java:259)
>    [junit4]   2>              at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat.readLiveDocs(Lucene50LiveDocsFormat.java:88)
>    [junit4]   2>              at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat.readLiveDocs(AssertingLiveDocsFormat.java:64)
>    [junit4]   2>              at 
> org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:102)
> {code}



