Re: Exceptions in merge thread (while optimizing) causing problems with subsequent reopens

Michael McCandless Fri, 10 Apr 2009 02:37:51 -0700

Actually it's perfectly fine for two threads to enter that code
fragment (you obtain a write lock to protect the code so that "there
can be only one").


Second off, even if you didn't have your write lock, the code should
still be safe in that no index corruption is possible.  Multiple
threads may call optimize(), commit() etc. on an IndexWriter without
harm.

The reopen code is also "safe" (will not cause corruption), but you
may accidentally have readers that you fail to close, or close readers
that are in-use by in-flight searches).  I'd recommend using the
"lia.admin.SearcherManager" class from the upcoming Lucene in Action
revision (it's in the book's source code, which you can download from
http://www.manning.com/hatcher3/LIAsourcecode.zip) to manage/reopen
the searcher.

But one surefire way to cause index corruption is if two separate
IndexWriters are open on the same index.  This is normally not easy to
do, since Lucene protects itself with the write lock in the index
directory.  So if you 1) turn off this locking (eg use NoLockFactory),
and 2) accidentally allow two writers on once on the same index,
you'll get corruption.

So.... I'm not sure that we've actually explained your corruption?

Mike

On Fri, Apr 10, 2009 at 12:42 AM, Khawaja Shams <kssh...@gmail.com> wrote:
> Mike,
>  I am sorry for wasting your time :). There were indeed two threads that
> were performing this operation. Out of curiosity, which part of this is not
> thread safe? An indexreader reopening while a commit is going on? Thanks
> again for your help.
>
> Regards,
> Khawaja
>
>
>
> On Thu, Apr 9, 2009 at 5:44 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> That code looks right.  Are there multiple threads that may enter it?
>>
>> Can you show the code where you create the IndexWriter, add docs, etc?
>>
>> Can you call IndexWriter.setInfoStream for the entire life of the
>> index, up until when the optimize error happens, and post back?
>>
>> Mike
>>
>> On Thu, Apr 9, 2009 at 8:33 PM, Khawaja Shams <kssh...@gmail.com> wrote:
>> > Hi Michael,
>> >  Thanks for the quick response. I only have one IndexWriter, and there
>> are
>> > no other processes accessing this particular index. I have tried deleting
>> > the entire index and reconstructing it, but the index corruption is
>> > repeatable. Incidentally, there are no new writes since the last commit
>> when
>> > the merge happens. I have over-padded my code with ReadWrite locks to
>> make
>> > sure that no writes/read are happening between the commits,
>> optimizations,
>> > and reopening of the index.
>> >
>> >
>> > Here is a snippet of the thread I use to maintain the Index ( I hope that
>> I
>> > am not doing something terribly wrong):
>> >            while (true) {
>> >                try {
>> >                    getWriteLock();
>> >                    indexWriter.commit();
>> >                   if (shouldOptimize()) {
>> >                       indexWriter.optimize();
>> >                    }
>> >
>> >                    IndexReader oldIR = indexSearcher.getIndexReader();
>> >                    IndexReader ir = oldIR.reopen();
>> >                    if (ir != oldIR) {
>> >                        IndexSearcher oldIS = indexSearcher;
>> >                        indexSearcher = new IndexSearcher(ir);
>> >                        oldIS.close();
>> >                        oldIR.close();
>> >                } catch (Throwable t) {
>> >                    trace.error(t, t);
>> >                } finally {
>> >                    releaseWriteLock();
>> >                }
>> >
>> >
>> > Regards,
>> > Khawaja
>> >
>> > On Thu, Apr 9, 2009 at 5:05 PM, Michael McCandless <
>> > luc...@mikemccandless.com> wrote:
>> >
>> >> These are serious corruption exceptions.
>> >>
>> >> Is it at all possible two writers are accessing the index at the same
>> time?
>> >>
>> >> Can you describe more about how you're using Lucene?
>> >>
>> >> Mike
>> >>
>> >> On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams <kssh...@gmail.com>
>> wrote:
>> >> > Hello,
>> >> >  I am having a problem with reopening the IndexReader with Lucene 2.4
>> ( I
>> >> > updated to 2.4.1, but still no luck). The exception is preceded by an
>> >> > exception in optimizing the index. I am not reopening the reader while
>> >> the
>> >> > commit or optimization is going on in the writer (optimizing happens
>> in
>> >> the
>> >> > same thread, but much less often). The issues go away once I turn off
>> >> > optimizations. I was also getting this problem before I turned off the
>> >> use
>> >> > of compound files. I would appreciate any guidance.
>> >> >
>> >> > Thanks!
>> >> >
>> >> > Regards,
>> >> > Khawaja
>> >> >
>> >> >
>> >> > 2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
>> >> > gov.nasa.ensemble.core.indexer.Indexer  - java.io.IOException:
>> background
>> >> > merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
>> >> > java.io.IOException: background merge hit exception: _8:C41258
>> _9:C11382
>> >> > into _a [optimize]
>> >> >    at
>> org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
>> >> >    at
>> org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
>> >> >    at
>> org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
>> >> >    at
>> >> >
>> >>
>> gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
>> >> > Caused by: org.apache.lucene.index.CorruptIndexException: doc counts
>> >> differ
>> >> > for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
>> >> >    at
>> >> >
>> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
>> >> >    at
>> >> > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
>> >> >    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
>> >> > Exception in thread "Lucene Merge Thread #0"
>> >> > org.apache.lucene.index.MergePolicy$MergeException:
>> >> > org.apache.lucene.index.CorruptIndexException: doc counts differ for
>> >> segment
>> >> > _8: fieldsReader shows 30074 but segmentInfo shows 41258
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
>> >> > Caused by: org.apache.lucene.index.CorruptIndexException: doc counts
>> >> differ
>> >> > for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
>> >> >    at
>> >> >
>> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
>> >> >    at
>> >> > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
>> >> >    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
>> >> > 2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
>> >> > gov.nasa.ensemble.core.indexer.Indexer  -
>> java.io.FileNotFoundException:
>> >> > /opt/users/merops/index/_a.fnm (No such file or directory)
>> >> > java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such
>> >> file
>> >> > or directory)
>> >> >    at java.io.RandomAccessFile.open(Native Method)
>> >> >    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
>> >> >    at
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
>> >> >    at
>> org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
>> >> >    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
>> >> >    at
>> >> >
>> org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
>> >> >    at
>> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>> >> >    at
>> >> >
>> >>
>> org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
>> >> >    at
>> >> >
>> >>
>> gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >>
>> >>
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Exceptions in merge thread (while optimizing) causing problems with subsequent reopens

Reply via email to