Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Alexander Lukyanchikov
Thank you very much for the response! I've created a bug and added all relevant details there: https://issues.apache.org/jira/browse/LUCENE-9867 Please let me know if you have any questions, or if any other information would be helpful. -- Regards, Alexander L On Wed, Mar 24, 2021 at 10:09 AM Mi

Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Michael McCandless
+1, this sounds like a bad bug in Lucene! We try hard to test for and prevent such bugs! As long as you succeeded in at least one commit since creating the index before you hit the disk full, restarting Lucene on the index should have recovered from that last successful commit. How often do you

Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Robert Muir
On Wed, Mar 24, 2021 at 1:41 AM Alexander Lukyanchikov < alexanderlukyanchi...@gmail.com> wrote: > Hello everyone, > > Recently we had a failed segment merge caused by "No space left on device". > After restart, Lucene failed with the CorruptIndexException. > The expectation was that Lucene automa

Re: CorruptIndexException when opening Index during first commit

2013-05-30 Thread Michael McCandless
OK I opened https://issues.apache.org/jira/browse/LUCENE-5024 ... Geoff can you describe your idea there? Thanks. Mike McCandless http://blog.mikemccandless.com On Thu, May 30, 2013 at 7:35 AM, Michael McCandless wrote: > On Mon, May 20, 2013 at 9:22 AM, Geoff Cooney wrote: >>> The problem

Re: CorruptIndexException when opening Index during first commit

2013-05-30 Thread Michael McCandless
On Mon, May 20, 2013 at 9:22 AM, Geoff Cooney wrote: >> The problem is we can't reliably differentiate commit-in-progress from >> a corrupt first commit... > > I think you can tell them apart with high probability because the checksum > is off by exactly one(at least in lucene 3.5 where I'm lookin

Re: CorruptIndexException when opening Index during first commit

2013-05-20 Thread Geoff Cooney
> The problem is we can't reliably differentiate commit-in-progress from > a corrupt first commit... I think you can tell them apart with high probability because the checksum is off by exactly one(at least in lucene 3.5 where I'm looking). It does seem dangerous to rely on an implementation det

Re: CorruptIndexException when opening Index during first commit

2013-05-17 Thread Michael McCandless
On Thu, May 16, 2013 at 2:59 PM, Geoff Cooney wrote: > Thanks for the response, Mike. > > If I understand correctly, the problem was incorrectly identifying a large > corrupted index as a non-existent index? Actually, a large healthy index as non-existent (because of file descriptor exhaustion).

Re: CorruptIndexException when opening Index during first commit

2013-05-16 Thread Geoff Cooney
Thanks for the response, Mike. If I understand correctly, the problem was incorrectly identifying a large corrupted index as a non-existent index? It seems like you'd really want an index with first-commit in progress to behave like an index with zero documents, as opposed to a non-existent inde

Re: CorruptIndexException when opening Index during first commit

2013-05-16 Thread Michael McCandless
Unfortunately this is expected behavior. We tried to fix it in LUCENE-2812, but this fix was too dangerous and could sometimes erase a good index (if transient IOExcs are happening, e.g. due to file descriptor exhaustion) so we reverted back in LUCENE-4738, so that indexExists will return true, an

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-29 Thread Michael McCandless
Ari Miller wrote: Is there an available SNAPSHOT of the 2.3 branch with this fix? Unfortunately, no -- our nightly build process only builds the trunk's snapshot. Mike

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Ari Miller
Confirmed that the manifest date on the 2.3-SNAPSHOT is much older than the file date: Implementation-Version: 2.3-SNAPSHOT 613047 - hudson - 2008-01-18 04:11:25 Is there an available SNAPSHOT of the 2.3 branch with this fix? I've downloaded the 2.4 SNAPSHOT to see if this will resolve the corru

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Grant Ingersoll
On Sep 26, 2008, at 6:30 AM, Michael McCandless wrote: Ari Miller wrote: According to https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949 (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround for the bug which causes the CorruptInd

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Michael McCandless
Ari Miller wrote: According to https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949 (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround for the bug which causes the CorruptIndexException was put in to the 2.3 branch and 2.4. However, w

Re: CorruptIndexException with some versions of java

2008-03-24 Thread Michael McCandless
Just to bring closure here: this in fact looks like some sort of JVM hotspot compiler issue, as best we can tell. Running java with -Xbatch (forces up front compilation) prevents (works around) the issue. I've committed some additional assertions to the particular Lucene code (merging o

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Ian can you attach your version of SegmentMerger.java? Somehow my lines are off from yours. Mike Ian Lea wrote: Mike Latest patch produces similar exception: Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Hi Ian, Sheesh that's odd. The SegmentMerger produced an .fdx file that is one document too short. Can you run with this patch now, again applied to head of 2.3 branch? I just added another assert inside the loop that does the field merging. I will scrutinize this code... Mike I

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Ian, Could you apply the attached patch to the head of the 2.3 branch? It only adds more asserts, to try to pinpoint where exactly this corruption starts. Then, re-run the test with asserts enabled and infoStream turned on and post back. Thanks. Mike Ian Lea wrote: It'

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
It's failed on servers running SuSE 10.0 and 8.2 (ancient!) $ uname -a shows Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux and Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 unknown unknown GNU/Linux The first one has a 2.8G

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Yonik Seeley
On Tue, Mar 18, 2008 at 7:38 AM, Ian Lea <[EMAIL PROTECTED]> wrote: > Hi > > > When bulk loading into a new index I'm seeing this exception > > Exception in thread "Thread-1" > org.apache.lucene.index.MergePolicy$MergeException: > org.apache.lucene.index.CorruptIndexException: doc counts differ

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
I don't see an attachment here -- maybe the mailing list software stripped it off. If so can you send directly to me? Thanks. Mike Ian Lea wrote: Documents are biblio records. All have title, author etc. stored, some have a few extra fields as well. Typically around 25 fields per doc.

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
Documents are biblio records. All have title, author etc. stored, some have a few extra fields as well. Typically around 25 fields per doc. The index is created with compound format, everything else as default. I've rerun the job until failure. Different numbers this time, but basically the sa

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
The data is loaded in chunks of up to 100K docs in separate runs of the program if that helps answer the first question. All buffers have default values, docs are small but not tiny, JVM is running with default settings. Answers to previous questions, and infostream, will follow once the job has

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
One question: do you know whether 67,861 docs "feels like" a newly flushed segment, or, the result of a merge? Ie, roughly how many docs are you buffering in IndexWriter before it flushes? Are they very small documents and your RAM buffer is large? Mike Ian Lea wrote: Hi When bulk l

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Can you call IndexWriter.setInfoStream(...) and get the error to happen and post back the resulting output? And, turn on assertions (java -ea) since that may catch the issue sooner. Can you describe how you are setting up IndexWriter (autoCommit, compound, etc.), and what your documents are

RE: CorruptIndexException

2007-11-29 Thread Melanie Langlois
Thank you, I did indeed use a newer version of Lucli by mistake. -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 6:30 PM To: java-user@lucene.apache.org Subject: Re: CorruptIndexException That exception means your index was written

Re: CorruptIndexException

2007-11-29 Thread Michael McCandless
That exception means your index was written with a newer version of Lucene than the version you are using to open the IndexReader. It looks like you used the unreleased (2.3 dev) version of Lucli from the Lucene trunk and then went back to an older Lucene JAR (maybe 2.2?) for accessing it? In ge