[ 
https://issues.apache.org/jira/browse/SOLR-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877683#comment-13877683
 ] 

Hoss Man commented on SOLR-5652:
--------------------------------

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1217/
Revision: 1559847
Using Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseSerialGC


{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=DistribCursorPagingTest -Dtests.method=testDistribSearch 
-Dtests.seed=F09D8E3EF23506C2 -Dtests.slow=true -Dtests.locale=sr_RS 
-Dtests.timezone=Australia/Tasmania -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 21.5s | DistribCursorPagingTest.testDistribSearch <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: walk already seen: 93, 
don't know why; q=id:93 gives: 
{responseHeader={status=0,QTime=7},response={numFound=1,start=0,maxScore=4.555348,docs=[SolrDocument{id=93,
 long=5077, long_last=5077, long_first=5077, long_dv_last=5077, 
long_dv_first=5077, float=-3.6574272E8, float_last=-3.6574272E8, 
float_first=-3.6574272E8, float_dv_last=-3.6574272E8, 
float_dv_first=-3.6574272E8, double=-1.3713607226255326E9, 
double_last=-1.3713607226255326E9, double_first=-1.3713607226255326E9, 
double_dv_last=-1.3713607226255326E9, double_dv_first=-1.3713607226255326E9, 
str=˜˺ʵ, str_last=˜˺ʵ, str_first=˜˺ʵ, str_dv_last=˜˺ʵ, str_dv_first=˜˺ʵ, 
bin=LfuliMaoJJG5866cs8lYmtS89ZDH2owXyi2QPp9kw6zpPlrrT4UAZw==, 
bin_last=LfuliMaoJJG5866cs8lYmtS89ZDH2owXyi2QPp9kw6zpPlrrT4UAZw==, 
bin_first=LfuliMaoJJG5866cs8lYmtS89ZDH2owXyi2QPp9kw6zpPlrrT4UAZw==, 
bin_dv_last=LfuliMaoJJG5866cs8lYmtS89ZDH2owXyi2QPp9kw6zpPlrrT4UAZw==, 
bin_dv_first=LfuliMaoJJG5866cs8lYmtS89ZDH2owXyi2QPp9kw6zpPlrrT4UAZw==, 
_version_=1457794677110472704}]}}
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([F09D8E3EF23506C2:717B0026856A66FE]:0)
   [junit4]    >        at 
org.apache.solr.cloud.DistribCursorPagingTest.assertFullWalkNoDups(DistribCursorPagingTest.java:636)
   [junit4]    >        at 
org.apache.solr.cloud.DistribCursorPagingTest.doRandomSortsOnLargeIndex(DistribCursorPagingTest.java:465)
   [junit4]    >        at 
org.apache.solr.cloud.DistribCursorPagingTest.doTest(DistribCursorPagingTest.java:86)
   [junit4]    >        at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867)
   [junit4]    >        at java.lang.Thread.run(Thread.java:744)
{noformat}

One thing that jumps out at me here is that in this failure the doc in question 
(93) doesn't have a value for the int_dv_last field being sorted on (no int 
fields show up in the doc dump above). I have no concrete info on whether that 
was true in the first failure as well.

The next steps I can think of:
* get the explicit sort criteria into the assertion failure so it's easier to 
verify w/o digging through the logs
* add more logging so every time a doc is seen in one of these cursor walks, we 
log its sort value -- I'm still not convinced this problem isn't a general 
shard inconsistency problem
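
A rough sketch of what those two steps might look like. This is not the actual DistribCursorPagingTest code: the class CursorWalkSketch, the SeenDoc holder, and the walkNoDups method are all hypothetical names, and the pages are plain in-memory lists standing in for cursor responses. It only illustrates recording the sort value for every doc seen and putting the sort criteria into the assertion message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real test: no Solr classes are used here.
class CursorWalkSketch {

    /** Stand-in for one returned doc: its id and the value it sorted on (null if missing). */
    static final class SeenDoc {
        final int id;
        final Object sortValue;

        SeenDoc(int id, Object sortValue) {
            this.id = id;
            this.sortValue = sortValue;
        }
    }

    /**
     * Walks every page in order, building one log line per doc seen.
     * Fails with the sort criteria and both sightings in the message
     * if any id shows up twice. Returns the log lines when the walk is clean.
     */
    static List<String> walkNoDups(String sort, List<List<SeenDoc>> pages) {
        Map<Integer, String> seen = new LinkedHashMap<Integer, String>();
        List<String> log = new ArrayList<String>();
        int pageNum = 0;
        for (List<SeenDoc> page : pages) {
            for (SeenDoc doc : page) {
                String where = "page=" + pageNum + " sortValue=" + doc.sortValue;
                log.add("sort=" + sort + " id=" + doc.id + " " + where);
                String previous = seen.put(doc.id, where);
                if (previous != null) {
                    throw new AssertionError("walk already seen: " + doc.id
                        + "; sort=" + sort
                        + "; first seen at " + previous
                        + "; seen again at " + where);
                }
            }
            pageNum++;
        }
        return log;
    }
}
```

With something like this, a failure message would carry the sort and the two sort values directly, so a doc that shows up once with a value and once without (or on different pages with the same missing value) would be visible without digging through the logs.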



> Heisenbug in DistribCursorPagingTest: "walk already seen ..."
> -------------------------------------------------------------
>
>                 Key: SOLR-5652
>                 URL: https://issues.apache.org/jira/browse/SOLR-5652
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>
> Twice now, Uwe's jenkins has encountered a "walk already seen ..." assertion 
> failure from DistribCursorPagingTest that I've been unable to fathom, let 
> alone reproduce.
> Using this as a tracking issue to try and make sense of it.
> Summary of things noticed so far (in 2 failures):
> * So far only seen on http://jenkins.thetaphi.de
> * So far only seen on MacOSX
> * So far only seen on branch 4x
> * So far seen on both Java6 and Java7
> * failures occurred in the first block of randomized testing: 
> ** we've indexed a small number of randomized docs
> ** we're explicitly looping over every field and sorting in both directions
> * both failures were when sorting on one of the "\*_dv_last desc" fields 
> (docValues=true, sortMissingLast=true) 
> ** sort on same field asc has always worked fine just before this (fields are 
> in arbitrary order, but "asc" always tried before "desc")
> ** sorting on some other random fields has sometimes been tried before this 
> and worked
> (specifics of each failure seen in the wild recorded in comments)


