[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

Mark Miller (JIRA) Fri, 06 Dec 2013 15:38:01 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841896#comment-13841896
 ]


Mark Miller commented on SOLR-4260:
-----------------------------------

I've fixed some things since 4.6. I only had time to focus on the leader not 
going down case for 4.6, and I spent a bunch more time on this case after 4.6 
was released. Unfortunately, I think there are a couple of issues at play here: 
some of the new changes make existing holes easier to spot, and the chaos 
monkey tests were accidentally disabled for some time, so small issues may 
have crept in.

I *think* the remaining issue is mostly around SOLR-5516. Need to come up with 
a better idea than a really long wait though - but if someone wants to help 
test, putting in a long wait and stressing this would be useful to see if it is 
indeed the main remaining issue.

I recently put in a lot of time improving the situation and I need to focus on 
other things for a bit, but I'll keep coming back to this as I can.

> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Assignee: Mark Miller
>            Priority: Critical
>             Fix For: 5.0, 4.7
>
>         Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention: there were small IDF differences for exactly the same record, 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch-all queries also return different 
> numDocs values.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.
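
One way to look for the deviation described above is to ask every core of a 
shard for its own document count and compare the answers. The following is 
only a minimal sketch, not something taken from the issue: the core URLs and 
the helper names (count_url, num_found, find_mismatches) are hypothetical, and 
it assumes indexing has stopped and a commit has been issued, since 
uncommitted documents would not be counted. The distrib=false parameter keeps 
the query on the core that receives it instead of fanning it out over the 
collection.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url):
    """Build a match-all count query answered by this core only."""
    # rows=0 returns just the count; distrib=false disables the distributed fan-out.
    params = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def http_get(url):
    """Default fetcher: return the response body of a GET request as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def num_found(core_url, fetch=http_get):
    """Return numFound for one core; `fetch` maps a URL to a JSON string."""
    return json.loads(fetch(count_url(core_url)))["response"]["numFound"]


def find_mismatches(shards, fetch=http_get):
    """Return {shard: {core_url: count}} for shards whose cores disagree."""
    mismatches = {}
    for shard, core_urls in shards.items():
        counts = {url: num_found(url, fetch) for url in core_urls}
        if len(set(counts.values())) > 1:
            mismatches[shard] = counts
    return mismatches


if __name__ == "__main__":
    # Hypothetical layout; replace with the core URLs of the cluster under test.
    shards = {
        "shard1": [
            "http://192.168.20.102:8983/solr/collection1_shard1_replica1",
            "http://192.168.20.104:8983/solr/collection1_shard1_replica2",
        ],
    }
    for shard, counts in find_mismatches(shards).items():
        print(shard, counts)
```

Running this repeatedly while no documents are being indexed should print 
nothing on a consistent cluster; any shard that is printed has cores whose 
counts disagree.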



--
This message was sent by Atlassian JIRA
(v6.1#6144)
