[jira] [Updated] (SOLR-10006) Cannot do a full sync (fetchindex) if the replica can't open a searcher

Mike Drob (JIRA) Wed, 25 Jan 2017 13:21:01 -0800

     [ https://issues.apache.org/jira/browse/SOLR-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Mike Drob updated SOLR-10006:
-----------------------------
    Attachment: SOLR-10006.patch

New patch that fixes your specific issue; however, it probably still needs a 
little work.

First, we would probably want to catch EOFException and FileNotFoundException 
in addition to NoSuchFileException in IndexWriter.
Second, do we actually want to catch that at IndexWriter? There is a wide range 
of places in the call stack where we could catch and rethrow, and one could 
reasonably argue for any of them:

{noformat}
        at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
        at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:192)
        at org.apache.solr.core.MetricsDirectoryFactory$MetricsDirectory.openInput(MetricsDirectoryFactory.java:334)
        at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader.<init>(Lucene50PostingsReader.java:81)
        at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat.fieldsProducer(Lucene50PostingsFormat.java:442)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:292)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:372)
        at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:109)
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:74)
        at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:143)
        at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:195)
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:103)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:473)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:103)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:79)
        at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:39)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1958)
{noformat}

That might be better as a Lucene discussion, though?
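
For the sake of discussion, here is a rough sketch of the catch-and-rethrow 
idea, independent of where in the stack it ends up. It is not the attached 
patch and it uses only JDK classes: UnreadableIndexException, IndexOpener, and 
openOrTranslate are hypothetical names invented for illustration, not Lucene 
or Solr APIs. The three "file is missing or truncated" exceptions are 
translated into a single type that a caller could treat as "replace the index".

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MissingFileSketch {

    /** Hypothetical marker exception; not a Lucene or Solr class. */
    static class UnreadableIndexException extends IOException {
        UnreadableIndexException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for any step that opens index files. */
    interface IndexOpener<T> {
        T open() throws IOException;
    }

    /**
     * Runs the opener and translates the three "file missing or truncated"
     * exceptions into one type. Any other IOException propagates unchanged.
     */
    static <T> T openOrTranslate(IndexOpener<T> opener) throws IOException {
        try {
            return opener.open();
        } catch (NoSuchFileException | FileNotFoundException | EOFException e) {
            throw new UnreadableIndexException(
                "index files missing or truncated: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Assumed path that does not exist; stands in for a deleted segment file.
        Path missing = Paths.get("no-such-dir", "_0.doc");
        try {
            openOrTranslate(() -> Files.readAllBytes(missing));
        } catch (UnreadableIndexException e) {
            // A caller that is about to replace the whole index could proceed here.
            System.out.println("could fall back to a full fetch, cause: "
                + e.getCause().getClass().getSimpleName());
        } catch (IOException e) {
            System.out.println("unrelated I/O failure: " + e);
        }
    }
}
```

The same wrapper could sit at any frame in the trace above; the open question 
is which layer should own the translation.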

> Cannot do a full sync (fetchindex) if the replica can't open a searcher
> -----------------------------------------------------------------------
>
>                 Key: SOLR-10006
>                 URL: https://issues.apache.org/jira/browse/SOLR-10006
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 5.3.1, 6.4
>            Reporter: Erick Erickson
>         Attachments: SOLR-10006.patch, SOLR-10006.patch, solr.log
>
>
> Doing a full sync or fetchindex requires an open searcher, and if you can't 
> open the searcher, those operations fail.
> For discussion. I've seen a situation in the field where a replica's index 
> became corrupt. When the node was restarted, the replica tried to do a full 
> sync but failed because the core couldn't open a searcher. The replica went 
> into an endless sync/fail/sync cycle.
> I couldn't reproduce that exact scenario, but it's easy enough to get into a 
> similar situation. Create a 2x2 collection and index some docs. Then stop one 
> of the instances and go in and remove a couple of segments files and restart.
> The replica stays in the "down" state, fine so far.
> Manually issue a fetchindex. That fails because the replica can't open a 
> searcher. Sure, issuing a fetchindex is abusive.... but I think it's the same 
> underlying issue: why should we care about the state of a replica's current 
> index when we're going to completely replace it anyway?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
