Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Alexander Lukyanchikov
> > > Recently we had a failed segment merge caused by "No space left on > > device". > > > After restart, Lucene failed with the CorruptIndexException. > > > The expectation was that Lucene automatically recovers in such > > > case, because there

Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Michael McCandless
AM Alexander Lukyanchikov < > alexanderlukyanchi...@gmail.com> wrote: > > > Hello everyone, > > > > Recently we had a failed segment merge caused by "No space left on > device". > > After restart, Lucene failed with the CorruptIndexException. > > The

Re: CorruptIndexException after failed segment merge caused by No space left on device

2021-03-24 Thread Robert Muir
On Wed, Mar 24, 2021 at 1:41 AM Alexander Lukyanchikov < alexanderlukyanchi...@gmail.com> wrote: > Hello everyone, > > Recently we had a failed segment merge caused by "No space left on device". > After restart, Lucene failed with the CorruptIndexException. >

CorruptIndexException after failed segment merge caused by No space left on device

2021-03-23 Thread Alexander Lukyanchikov
Hello everyone, Recently we had a failed segment merge caused by "No space left on device". After restart, Lucene failed with the CorruptIndexException. The expectation was that Lucene automatically recovers in such a case, because there was no successful commit. Is it a correct assumptio

Re: CorruptIndexException when opening Index during first commit

2013-05-30 Thread Michael McCandless
OK I opened https://issues.apache.org/jira/browse/LUCENE-5024 ... Geoff can you describe your idea there? Thanks. Mike McCandless http://blog.mikemccandless.com On Thu, May 30, 2013 at 7:35 AM, Michael McCandless wrote: > On Mon, May 20, 2013 at 9:22 AM, Geoff Cooney wrote: >>> The problem

Re: CorruptIndexException when opening Index during first commit

2013-05-30 Thread Michael McCandless
On Mon, May 20, 2013 at 9:22 AM, Geoff Cooney wrote: >> The problem is we can't reliably differentiate commit-in-progress from >> a corrupt first commit... > > I think you can tell them apart with high probability because the checksum > is off by exactly one(at least in lucene 3.5 where I'm lookin

Re: CorruptIndexException when opening Index during first commit

2013-05-20 Thread Geoff Cooney
> The problem is we can't reliably differentiate commit-in-progress from > a corrupt first commit... I think you can tell them apart with high probability because the checksum is off by exactly one (at least in Lucene 3.5, where I'm looking). It does seem dangerous to rely on an implementation det
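
The idea Geoff describes above can be sketched in plain Java. This is only an illustration under stated assumptions: it assumes, as the message reports for Lucene 3.5, that a prepared but unfinished commit stores a checksum exactly one less than the true value. The class `CommitChecksumSketch`, its methods, and the use of `CRC32` are hypothetical and are not Lucene API.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper for illustration; not part of Lucene. */
final class CommitChecksumSketch {
    enum State { COMMITTED, COMMIT_IN_PROGRESS, CORRUPT }

    /** Computes a CRC32 checksum over the payload bytes. */
    static long checksumOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /**
     * Classifies a file from its payload and the checksum stored with it.
     * Assumption: an unfinished commit stores (checksum - 1).
     */
    static State classify(byte[] payload, long storedChecksum) {
        long expected = checksumOf(payload);
        if (storedChecksum == expected) {
            return State.COMMITTED;
        }
        if (storedChecksum == expected - 1) {
            return State.COMMIT_IN_PROGRESS;
        }
        return State.CORRUPT;
    }

    public static void main(String[] args) {
        byte[] payload = "segments_1 contents".getBytes(StandardCharsets.UTF_8);
        long crc = checksumOf(payload);
        System.out.println(classify(payload, crc));     // COMMITTED
        System.out.println(classify(payload, crc - 1)); // COMMIT_IN_PROGRESS
        System.out.println(classify(payload, crc + 7)); // CORRUPT
    }
}
```

As the message itself cautions, a check like this leans on an implementation detail, and a genuinely corrupt file could by chance also be off by one, so it separates the two cases only with high probability.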

Re: CorruptIndexException when opening Index during first commit

2013-05-17 Thread Michael McCandless
On Thu, May 16, 2013 at 2:59 PM, Geoff Cooney wrote: > Thanks for the response, Mike. > > If I understand correctly, the problem was incorrectly identifying a large > corrupted index as a non-existent index? Actually, a large healthy index as non-existent (because of file descriptor exhaustion).

Re: CorruptIndexException when opening Index during first commit

2013-05-16 Thread Geoff Cooney
n > LUCENE-4738, so that indexExists will return true, and trying to open > an IndexReader will throw CorruptIndexException, when the first-commit > has started but not yet finished. > > Mike McCandless > > http://blog.mikemccandless.com > > > On Thu, May 16, 2013

Re: CorruptIndexException when opening Index during first commit

2013-05-16 Thread Michael McCandless
, and trying to open an IndexReader will throw CorruptIndexException, when the first-commit has started but not yet finished. Mike McCandless http://blog.mikemccandless.com On Thu, May 16, 2013 at 10:45 AM, Geoff Cooney wrote: > Hi, > > We're occasionally seeing a CorruptIndexEx

CorruptIndexException when opening Index during first commit

2013-05-16 Thread Geoff Cooney
Hi, We're occasionally seeing a CorruptIndexException when a searcher is opened on a new index. When we see the exception, it looks like what is happening is that the searcher is opening the index after prepareCommit for segments_1 but before the commit is completed. Because there is no

Re: Re: Re: Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Ian Lea
chael McCandless"; > Sent: Saturday, January 26, 2013, 12:35 AM > To: "java-user"; > > Subject: Re: Re: Re: Re: Re: IndexReader.open and CorruptIndexException > > > > That should work. > > Mike McCandless > > http://blog.mikemccandless.com > > > On Fri, J

Re: Re: Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread zhoucheng2008
"java-user"; Subject: Re: Re: Re: Re: Re: IndexReader.open and CorruptIndexException That should work. Mike McCandless http://blog.mikemccandless.com On Fri, Jan 25, 2013 at 11:27 AM, zhoucheng2008 wrote: > Sorry, I meant this: > > > SearcherManager sm = new Searcher

Re: Re: Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Michael McCandless
- > From: "Ian Lea"; > Sent: Saturday, January 26, 2013, 12:16 AM > To: "java-user"; > > Subject: Re: Re: Re: Re: IndexReader.open and CorruptIndexException > > > >> Is SearcherFactory the same as SearcherManager? > > No. > >> Ian mentioned a new war

Re: Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread zhoucheng2008
Sorry, I meant this: SearcherManager sm = new SearcherManager(dir, new SearcherFactory()); -- Original Message -- From: "Ian Lea"; Sent: Saturday, January 26, 2013, 12:16 AM To: "java-user"; Subject: Re: Re: Re: Re: IndexReader.open and CorruptIndexException >

Re: Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Ian Lea
:10 > To: "java-user"; > > Subject: Re: Re: Re: IndexReader.open and CorruptIndexException > > > > You can pass null for the SearcherFactory ... then SearcherManager > will just do new IndexSearcher(reader) for you. > > Mike McCandless > > http://blog.mik

Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread dyzc2010
Is SearcherFactory the same as SearcherManager? Ian mentioned a new warmer() solution. Maybe I can try that first. -- Original Message -- From: "Michael McCandless"; Sent: Saturday, January 26, 2013, 12:10 AM To: "java-user"; Subject: Re: Re: Re: IndexReader.open and C

Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Michael McCandless
; > Sent: Friday, January 25, 2013, 9:26 PM > To: "java-user"; > > Subject: Re: Re: IndexReader.open and CorruptIndexException > > > > Maybe here?: > > > http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html > > Mike McCandless >

Re: Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Ian Lea
ichael McCandless"; > Sent: Friday, January 25, 2013, 9:26 PM > To: "java-user"; > > Subject: Re: Re: IndexReader.open and CorruptIndexException > > > > Maybe here?: > > > http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html > >

Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread zhoucheng2008
ess"; Sent: Friday, January 25, 2013, 9:26 PM To: "java-user"; Subject: Re: Re: IndexReader.open and CorruptIndexException Maybe here?: http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html Mike McCandless http://blog.mikemccandless.com On Fri, Jan 25, 2013

Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Michael McCandless
ation somewhat self-contradictory. After I read >> it, I am confused if I should close the file handlers in the finally block >> or not. I am using Java. >> > >> > >> > >> > >> > -- Original Message -- >> > From

Re: Re: IndexReader.open and CorruptIndexException

2013-01-25 Thread Cheng
ion somewhat self-contradictory. After I read > it, I am confused if I should close the file handlers in the finally block > or not. I am using Java. > > > > > > > > > > -- Original Message -- > > From: "Ian Lea"; > > Sent

Re: Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread Ian Lea
. > > > > > -- Original Message -- > From: "Ian Lea"; > Sent: Thursday, January 24, 2013, 5:46 PM > To: "java-user"; > > Subject: Re: IndexReader.open and CorruptIndexException > > > > Well, raising the limits is one option but there may

Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread zhoucheng2008
dlers in the finally block or not. I am using Java. -- Original Message -- From: "Ian Lea"; Sent: Thursday, January 24, 2013, 5:46 PM To: "java-user"; Subject: Re: IndexReader.open and CorruptIndexException Well, raising the limits is one option but there may be bet
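
On the question above about closing handles in a finally block: a minimal sketch, assuming Java 7 or later, of how try-with-resources releases a resource even when the work inside throws. `FakeReader` and the handle counter are hypothetical stand-ins for anything that holds an OS file descriptor; nothing here is Lucene API.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CloseInFinallySketch {
    /** Counts simulated open file handles so a leak would be visible. */
    static final AtomicInteger OPEN_HANDLES = new AtomicInteger();

    /** Hypothetical stand-in for a reader that holds a file descriptor. */
    static final class FakeReader implements AutoCloseable {
        FakeReader() {
            OPEN_HANDLES.incrementAndGet();
        }

        /** Pretend search; fails on an empty query to simulate an error. */
        int search(String query) {
            if (query.isEmpty()) {
                throw new IllegalArgumentException("empty query");
            }
            return query.length();
        }

        @Override
        public void close() {
            OPEN_HANDLES.decrementAndGet();
        }
    }

    /** The reader is closed on both the normal and the exceptional path. */
    static int searchOnce(String query) {
        try (FakeReader reader = new FakeReader()) {
            return reader.search(query);
        }
    }

    public static void main(String[] args) {
        System.out.println(searchOnce("lucene"));   // 6
        System.out.println(OPEN_HANDLES.get());     // 0
    }
}
```

Opening and closing a reader per request avoids leaking descriptors but can be expensive; the later messages in this thread point toward sharing one reader through SearcherManager instead.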

Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread Ian Lea
oint.java:990) >> at java.lang.Thread.run(Thread.java:722) > > > >> Too many open files... How to solve it? > > >> On Tue, Jan 22, 2013 at 10:52 PM, Michael McCandless < >> luc...@mikemccandless.com> wrote: > >>> Can you post the full stack trace of

Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread Rafał Kuć
point$Acceptor.run(AprEndpoint.java:990) > at java.lang.Thread.run(Thread.java:722) > Too many open files... How to solve it? > On Tue, Jan 22, 2013 at 10:52 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: >> Can you post the full stack trace

Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread Cheng
(AprEndpoint.java:990) at java.lang.Thread.run(Thread.java:722) Too many open files... How to solve it? On Tue, Jan 22, 2013 at 10:52 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Can you post the full stack trace of the CorruptIndexException? > > Mike McCa

Re: IndexReader.open and CorruptIndexException

2013-01-22 Thread Michael McCandless
Can you post the full stack trace of the CorruptIndexException? Mike McCandless http://blog.mikemccandless.com On Tue, Jan 22, 2013 at 8:20 AM, Cheng wrote: > Hi, > > I run a Lucene application on Tomcat. The app will try to open a Linux > directory, and sometime returns CorruptIn

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-30 Thread jm
On Mon, Nov 30, 2009 at 2:34 PM, Michael McCandless wrote: > On Mon, Nov 30, 2009 at 7:22 AM, jm wrote: >> No other exceptions I could spot. > > OK > >> OS: win2003 32bits, with NTFS. This is a vm running on vmware fusion on a >> mac. > > That should be fine... > >> jvm: I made sure, java versio

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-30 Thread Michael McCandless
On Mon, Nov 30, 2009 at 7:22 AM, jm wrote: > No other exceptions I could spot. OK > OS: win2003 32bits, with NTFS. This is a vm running on vmware fusion on a mac. That should be fine... > jvm: I made sure, java version "1.6.0_14" Good. > IndexWriter settings: >        writer.setMaxFieldLengt

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-30 Thread jm
No other exceptions I could spot. OS: win2003 32bits, with NTFS. This is a vm running on vmware fusion on a mac. jvm: I made sure, java version "1.6.0_14" IndexWriter settings: writer.setMaxFieldLength(maxFieldLength); writer.setMergeFactor(10); writer.setRAMBufferSizeMB(

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread jm
I'll check all these with my ops guy on Monday and report back. Thanks for the interest. On Fri, Nov 27, 2009 at 4:00 PM, Michael McCandless wrote: > Any Lucene-related exceptions hit in your env?  What OS (looks like > Windows, but which one?), filesystem are you on? > > And are you really cert

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread Michael McCandless
Any Lucene-related exceptions hit in your env? What OS (looks like Windows, but which one?), filesystem are you on? And are you really certain about the java version being used in your production env? Don't just trust which java your interactive shell finds on its PATH -- double check how your a
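
One way to act on the advice above, rather than trusting whichever java is on the PATH, is to have the application report the JVM it is actually running in. A minimal sketch using standard system properties; the class name `JvmReportSketch` is hypothetical, and where the string gets logged is left to the application.

```java
final class JvmReportSketch {
    /** Builds a one-line description of the JVM running this process. */
    static String describeJvm() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.vm.name=" + System.getProperty("java.vm.name")
                + ", java.vm.version=" + System.getProperty("java.vm.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        // Call this at application startup and write the result to the log.
        System.out.println(describeJvm());
    }
}
```

Logging this line at startup in the production process makes it possible to confirm the exact update and build of the JVM when a JVM-specific bug is suspected.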

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread Michael McCandless
dex > (path in case of a fsdirectory etc) when found an exception, dont > assume there is a single index in the application. I agree that'd be nice... we should probably fix CorruptIndexException to accept a Directory, which it then toString's and includes in the mes

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread jm
I manually did CheckIndex in all indexes and found two with issues: first Segments file=segments_42w numSegments=21 version=FORMAT_HAS_PROX [Lucene 2.4] 1 of 21: name=_109 docCount=10410 compound=true hasProx=true numFiles=1 size (MB)=55,789 no deletions test: open reader

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread jm
Ok, I got the index from the production machine, but I am having trouble finding the index... Our process deals with multiple indexes, and in the current exception I cannot see any indication of which index has the issue. I opened all my indexes with Luke and all opened successfully, some had

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-27 Thread Michael McCandless
Also, if you're able to reproduce this, can you call writer.setInfoStream and capture & post the resulting output leading up to the exception? Mike On Thu, Nov 26, 2009 at 7:12 AM, jm wrote: > The process is still running and ops dont want to stop it. As soon as > stops I'll try checkindex. > >

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-26 Thread jm
The process is still running and ops don't want to stop it. As soon as it stops I'll try CheckIndex. It was created brand new with 2.4.1. On Thu, Nov 26, 2009 at 12:42 PM, Michael McCandless wrote: > I think you're using a JRE that has the fix for the issue found in > LUCENE-1282. > > Can you run Check

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-26 Thread Michael McCandless
I think you're using a JRE that has the fix for the issue found in LUCENE-1282. Can you run CheckIndex on your index and post the output? Was this index created from scratch on Lucene 2.4.1? Or, created from an earlier Lucene version? Mike On Thu, Nov 26, 2009 at 6:03 AM, jm wrote: > or are w

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-26 Thread jm
or are we really? I think we are on 1.6 update 14, right? Sorry, I'm lost right now on JDK version numbering. On Thu, Nov 26, 2009 at 12:01 PM, jm wrote: > On second thought... I hadn't noticed the JDK numbers properly; we are > using b28, and JDK 6 Update 10 (b28) is the one fixing this... > >

Re: MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-26 Thread jm
On second thought... I hadn't noticed the JDK numbers properly; we are using b28, and JDK 6 Update 10 (b28) is the one fixing this... OK, forget this then. Thanks! On Thu, Nov 26, 2009 at 11:55 AM, jm wrote: > Hi, > > Don't know if this should be here or in java-dev, posting to this one > first

MergePolicy$MergeException CorruptIndexException in lucene2.4.1

2009-11-26 Thread jm
Hi, Don't know if this should be here or in java-dev, posting to this one first. In one of our installations, we have encountered an exception: Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out o

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-29 Thread Michael McCandless
Ari Miller wrote: Is there an available SNAPSHOT of the 2.3 branch with this fix? Unfortunately, no -- our nightly build process only builds the trunk's snapshot. Mike

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Ari Miller
workaround >> for the bug which causes the CorruptIndexException was put in to the >> 2.3 branch and 2.4. >> However, we are still experiencing this issue (intermittent creation >> of a corrupt index) with a 2.3-SNAPSHOT from maven. >> Was the workaround put into 2.3-SNAPSH

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Grant Ingersoll
CorruptIndexException was put in to the 2.3 branch and 2.4. However, we are still experiencing this issue (intermittent creation of a corrupt index) with a 2.3-SNAPSHOT from maven. Was the workaround put into 2.3-SNAPSHOT? Are there other issues which would cause the same error (detailed below)? We would

Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-26 Thread Michael McCandless
Ari Miller wrote: According to https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949 (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround for the bug which causes the CorruptIndexException was put in to the 2.3 branch and 2.4. However

CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)

2008-09-25 Thread Ari Miller
According to https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949 (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround for the bug which causes the CorruptIndexException was put in to the 2.3 branch and 2.4. However, we are still

Re: "Off By One": CorruptIndexException

2008-05-14 Thread Michael McCandless
OK thanks for the update. It's another datapoint, and it tells us _06 doesn't fix it. I'll add it to the Jira issue. Mike Ian Lea wrote: Hi My job (http://lucene.markmail.org/message/awkkunr7j24nh4qj) still fails with java version 1.6.0_06 (build 1.6.0_06-b02), downloaded today, with bo

Re: "Off By One": CorruptIndexException

2008-05-14 Thread Ian Lea
Hi My job (http://lucene.markmail.org/message/awkkunr7j24nh4qj) still fails with java version 1.6.0_06 (build 1.6.0_06-b02), downloaded today, with both lucene 2.3.1 and 2.3.2. For me, downgrading to 1.6.0_03-b05 fixed things. -- Ian. On Tue, May 13, 2008 at 7:56 PM, Stu Hood <[EMAIL PROTECT

"Off By One": CorruptIndexException

2008-05-13 Thread Stu Hood
Hey gang, I think we've been suffering from the following bug, and I have a question about the JVM fix. http://markmail.org/message/di3vdyfq5odfbai6 We're running 1.6.0_05 and Lucene 2.3.2. Supposedly downgrading to 1.6.0_02 will fix the issue, but I'd much rather upgrade if possible. 1.6.0_0

Re: CorruptIndexException with some versions of java

2008-03-24 Thread Michael McCandless
Just to bring closure here: this in fact looks like some sort of JVM hotspot compiler issue, as best we can tell. Running java with -Xbatch (forces up front compilation) prevents (works around) the issue. I've committed some additional assertions to the particular Lucene code (merging o

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Ian can you attach your version of SegmentMerger.java? Somehow my lines are off from yours. Mike Ian Lea wrote: Mike Latest patch produces similar exception: Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Hi Ian, Sheesh that's odd. The SegmentMerger produced an .fdx file that is one document too short. Can you run with this patch now, again applied to head of 2.3 branch? I just added another assert inside the loop that does the field merging. I will scrutinize this code... Mike I

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Ian, Could you apply the attached patch to the head of the 2.3 branch? It only adds more asserts, to try to pinpoint where exactly this corruption starts. Then, re-run the test with asserts enabled and infoStream turned on and post back. Thanks. Mike Ian Lea wrote: It'

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
It's failed on servers running SuSE 10.0 and 8.2 (ancient!) $ uname -a shows Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux and Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 unknown unknown GNU/Linux The first one has a 2.8G

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Yonik Seeley
On Tue, Mar 18, 2008 at 7:38 AM, Ian Lea <[EMAIL PROTECTED]> wrote: > Hi > > > When bulk loading into a new index I'm seeing this exception > > Exception in thread "Thread-1" > org.apache.lucene.index.MergePolicy$MergeException: > org.apache.lucene.index.CorruptIndexException: doc counts differ

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
I don't see an attachment here -- maybe the mailing list software stripped it off. If so can you send directly to me? Thanks. Mike Ian Lea wrote: Documents are biblio records. All have title, author etc. stored, some have a few extra fields as well. Typically around 25 fields per doc.

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
Documents are biblio records. All have title, author etc. stored, some have a few extra fields as well. Typically around 25 fields per doc. The index is created with compound format, everything else as default. I've rerun the job until failure. Different numbers this time, but basically the sa

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
The data is loaded in chunks of up to 100K docs in separate runs of the program if that helps answer the first question. All buffers have default values, docs are small but not tiny, JVM is running with default settings. Answers to previous questions, and infostream, will follow once the job has

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
One question: do you know whether 67,861 docs "feels like" a newly flushed segment, or, the result of a merge? Ie, roughly how many docs are you buffering in IndexWriter before it flushes? Are they very small documents and your RAM buffer is large? Mike Ian Lea wrote: Hi When bulk l

Re: CorruptIndexException with some versions of java

2008-03-18 Thread Michael McCandless
Can you call IndexWriter.setInfoStream(...) and get the error to happen and post back the resulting output? And, turn on assertions (java -ea) since that may catch the issue sooner. Can you describe how you are setting up IndexWriter (autoCommit, compound, etc.), and what your documents are

CorruptIndexException with some versions of java

2008-03-18 Thread Ian Lea
Hi When bulk loading into a new index I'm seeing this exception Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862 at or

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-12-18 Thread Grant Ingersoll
Hey Bill, Any status on this? On Dec 2, 2007, at 10:37 PM, Bill Janssen wrote: Hmmm, it still sounds like you are hitting a threading issue that is probably exacerbated by the multicore platform of the newer machine. Exactly what I was thinking. What are the details of the CPUs of these two

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-12-02 Thread Bill Janssen
> > Hmmm, it still sounds like you are hitting a threading issue that is > > probably exacerbated by the multicore platform of the newer machine. > > Exactly what I was thinking. > What are the details of the CPUs of these two systems? Ah, good point. The bad machine is a dual-processor 1GHz G4

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-12-02 Thread Yonik Seeley
On Dec 2, 2007 9:28 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > Hmmm, it still sounds like you are hitting a threading issue that is > probably exacerbated by the multicore platform of the newer machine. Exactly what I was thinking. What are the details of the CPUs of these two systems? -Yon

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-12-02 Thread Grant Ingersoll
Hmmm, it still sounds like you are hitting a threading issue that is probably exacerbated by the multicore platform of the newer machine. Is there any way to put together a unit test that we can try? Thanks, Grant On Dec 2, 2007, at 9:10 PM, Bill Janssen wrote: I'll see if I can get back t

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-12-02 Thread Bill Janssen
> I'll see if I can get back to this over the weekend. I got a chance to copy my corpus to another G4 and try indexing with Lucene 2.2. This one seems OK! Same texts. So now I'm inclined to believe that it *is* the machine, rather than the code. Whew! Though that doesn't explain why 2.0 works

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-30 Thread Bill Janssen
> Your errors seem to happen around the same area (~20K docs). If you > skip the first say ~18K docs does the error still happen? We need to > somehow narrow this down. I'm trying to boil down the documents to a set which I can deploy on a DVD-ROM, so I can move the same set around from machine

RE: CorruptIndexException

2007-11-29 Thread Melanie Langlois
Thank you, I did indeed use a newer version of Lucli by mistake. -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 6:30 PM To: java-user@lucene.apache.org Subject: Re: CorruptIndexException That exception means your index was written

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Grant Ingersoll
I have PPC and Intel access if that helps. Just need a test case. On Nov 29, 2007, at 5:37 PM, Michael McCandless wrote: "Bill Janssen" <[EMAIL PROTECTED]> wrote: No. It's in another location, but perhaps I can get it tomorrow. On the other hand, the success when using 2.0 makes it likely t

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Michael McCandless
"Grant Ingersoll" <[EMAIL PROTECTED]> wrote: > Just a theory (make that a guess), Mike, but is it possible that the > one merge scheduler is hitting a synchronization issue with the > deletedDocuments bit vector? That is one thread is cleaning it up and > the other is accessing and they are

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Michael McCandless
"Bill Janssen" <[EMAIL PROTECTED]> wrote: > No. It's in another location, but perhaps I can get it tomorrow. > On the other hand, the success when using 2.0 makes it likely to me > that the machine isn't the problem. Yeah good point. Seems like a long shot (wishful thinking on my part!). Your

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Grant Ingersoll
Just a theory (make that a guess), Mike, but is it possible that the one merge scheduler is hitting a synchronization issue with the deletedDocuments bit vector? That is one thread is cleaning it up and the other is accessing and they aren't synchronizing their access? This doesn't explain

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Michael McCandless
This is in the nightly JAR. It's o.a.l.index.CheckIndex (it defines a static main). Mike "Bill Janssen" <[EMAIL PROTECTED]> wrote: > > Also, could you try out the CheckIndex tool in 2.3-dev before and > > after the deletes? > > Great idea! I don't suppose there's a jar file of it? > > Bill

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
So, it's a little clearer. I get the Array-out-of-bounds exception if I'm re-indexing some already indexed documents -- if there are deletions involved. I get the CorruptIndexException if I'm indexing freshly -- no deletions. Here's an example of that (with the latest nigh

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> Also, could you try out the CheckIndex tool in 2.3-dev before and > after the deletes? Great idea! I don't suppose there's a jar file of it? Bill

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> Have you tried another PPC machine? No. It's in another location, but perhaps I can get it tomorrow. On the other hand, the success when using 2.0 makes it likely to me that the machine isn't the problem. OK, I've reverted to my original codebase (where I first create a reader and do the dele

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> Could you post this part of the code (deleting) too? Here it is: private static void remove (File index_file, String[] doc_ids, int start) { String number; String list; Term term; TermDocs matches; if (debug_mode) System.err.println("in

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Grant Ingersoll
On Nov 29, 2007, at 2:26 PM, Bill Janssen wrote: Are you still getting the original exception too or just the Array out of bounds one now? Also, are you doing anything else to the index while this is happening? The code at the point in the exception below is trying to p

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Michael McCandless
es the optimize with that writer, which is where > the CorruptIndexException started coming up. I'm going to run that > again with 2.0, then with last night's build. Could you post this part of the code (deleting) too? > I'm not sure if the success with 2.0 meant that a corrupted index

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
n creates a new writer to re-index the same documents, then does the optimize with that writer, which is where the CorruptIndexException started coming up. I'm going to run that again with 2.0, then with last night's build. I'm not sure if the success with 2.0 meant that a corrupted index

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Grant Ingersoll
Are you still getting the original exception too or just the Array out of bounds one now? Also, are you doing anything else to the index while this is happening? The code at the point in the exception below is trying to properly handle deleted documents. -Grant On Nov 29, 2007, at 1:34 P

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> Can you try running with the trunk version of Lucene (2.3-dev) and see > if the error still occurs? EG you can download this AM's build here: > > > http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/288/artifact/artifacts Still there. Here's the dump with last night's build: /L

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> > Another thing to try is turning on the infoStream > > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > > It will be very large since it takes quite a while for the error to > > occur... > > I can do that. Here's a more complete dump. I've modified the code so that I n

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> > Another thing to try is turning on the infoStream > > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > > It will be very large since it takes quite a while for the error to > > occur... > > I can do that. Here's what I see: Optimizing... merging segments _ram_a (1 doc

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Bill Janssen
> Do you have another PPC machine to reproduce this on? (To rule out > bad RAM/hard-drive on the first one). I'll dig up an old laptop and try it there. > Another thing to try is turning on the infoStream > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > It will be very

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-29 Thread Michael McCandless
"Bill Janssen" <[EMAIL PROTECTED]> wrote: > > Hmmm ... how many chunks of "about 50 pages" do you do before > > hitting this? Roughly how many docs are in the index when it > > happens? > > Oh, gosh, not sure. I'm guessing it's about half done. Ugh, OK. If we could boil this down to a smaller

Re: CorruptIndexException

2007-11-29 Thread Michael McCandless
That exception means your index was written with a newer version of Lucene than the version you are using to open the IndexReader. It looks like you used the unreleased (2.3 dev) version of Lucli from the Lucene trunk and then went back to an older Lucene JAR (maybe 2.2?) for accessing it? In ge
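
A minimal sketch of the kind of check that produces this error, not Lucene's actual code. It assumes format markers are negative integers where a more negative value means a newer format, and that the older JAR's newest known format is -3. The class name, the constant, and both method names are hypothetical.

```java
import java.io.IOException;

public class FormatVersionCheck {
    // Assumption for illustration: the newest format this (older) reader understands.
    static final int NEWEST_KNOWN_FORMAT = -3;

    // Rejects any format marker newer (more negative) than the reader knows about.
    static void checkFormat(int formatOnDisk) throws IOException {
        if (formatOnDisk < NEWEST_KNOWN_FORMAT) {
            throw new IOException("Unknown format version: " + formatOnDisk);
        }
    }

    // Returns "ok" when the format is readable, otherwise the error message.
    static String describe(int formatOnDisk) {
        try {
            checkFormat(formatOnDisk);
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(-3)); // format written by the same version
        System.out.println(describe(-4)); // format written by a newer version
    }
}
```

Under these assumptions the index itself is intact; the reader is simply too old to understand it, so opening it with the JAR that wrote it (or a newer one) should avoid the error.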

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
> Hmmm ... how many chunks of "about 50 pages" do you do before hitting this? > Roughly how many docs are in the index when it happens? Oh, gosh, not sure. I'm guessing it's about half done. > Can you describe the docs/fields you're adding? I've got 1735 documents, 18969 pages -- average page s

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
> I'm going to run the same software on an > Intel machine and see what happens. So, I ran the same codebase with lucene-core-2.2.0.jar on an Intel Mac Pro, OS X 10.5.0, Java 1.5, and no exception is raised. Different corpus, about 5 pages instead of 2. This is reinforcing my thinking th

CorruptIndexException

2007-11-28 Thread Melanie Langlois
Hi, I used Lucli to optimize my index while my application was stopped. After restarting my application, I could not search my index anymore; I got the following exception: org.apache.lucene.index.CorruptIndexException: Unknown format version: -4 at org.apache.lucene.index.Se

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
> You are not hitting any other exception before this one right? > > Can you change your test case so that the "catch" clause is run > before the "finally" clause? I wonder if you are hitting some > interesting exception and then trying to optimize, which then > masks the original exception. Yes

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Michael McCandless
Hmmm ... how many chunks of "about 50 pages" do you do before hitting this? Roughly how many docs are in the index when it happens? Can you describe the docs/fields you're adding? You are not hitting any other exception before this one right? Can you change your test case so that the "catch" cl
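
To illustrate the masking concern, here is a small self-contained sketch in plain Java, with no Lucene calls. The methods `addDocuments` and `optimize` are hypothetical stand-ins for the indexing loop and the optimize step; they are not the poster's code.

```java
public class MaskedExceptionDemo {
    // Hypothetical stand-in for the indexing work that hits the "interesting" exception.
    static void addDocuments() {
        throw new IllegalStateException("original failure while adding documents");
    }

    // Hypothetical stand-in for an optimize step that also fails afterwards.
    static void optimize() {
        throw new RuntimeException("secondary failure during optimize");
    }

    // The finally block throws, so the original exception is replaced and lost.
    static String masked() {
        try {
            try {
                addDocuments();
            } finally {
                optimize();
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
        return "no exception";
    }

    // The catch clause records the original failure first, and the finally
    // block skips the risky cleanup once a failure has been seen.
    static String unmasked() {
        String first = null;
        try {
            try {
                addDocuments();
            } catch (RuntimeException e) {
                first = e.getMessage();
                throw e;
            } finally {
                if (first == null) {
                    optimize();
                }
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
        return "no exception";
    }

    public static void main(String[] args) {
        System.out.println("masked:   " + masked());
        System.out.println("unmasked: " + unmasked());
    }
}
```

In the first variant only the optimize failure surfaces, which is the situation the question above is trying to rule out.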

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
> Are you really sure in your 2.2 test you are starting with no prior > index? I'd ask that too, but yes, I'm really really sure. Building a completely new index each time. Works with 2.0.0. Fails with 2.2.0. Works with 2.2.0 *if* I remove the optimization step. Bill

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Michael McCandless
Are you really sure in your 2.2 test you are starting with no prior index? 2.2 should in fact work fine with a 2.0 index but it's possible there was some latent corruption in the 2.0 index if you are accidentally using it. That exception looks a lot like this dreaded bug: https://issues.apache.

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
I just tried re-indexing with lucene-core-2.0.0.jar and the same indexing code; works great. So what am I doing wrong with 2.2? Bill

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
Here's the code I'm using:

    try {
        // Now add the documents to the index
        IndexWriter writer = new IndexWriter(index_loc, new StandardAnalyzer(), !index_loc.exists());
        writer.setMaxFieldLength(Integer.MAX_VALUE);
        try {
            for (in

lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-28 Thread Bill Janssen
I've got a DB of about 2 pages which I thought I'd update to Lucene 2.2. I removed the old index (2.0 based) completely, and started re-indexing all the documents. I do this in stages, of about 50 pages at a time, serially, starting a new JVM each time, and reading in the existing index, then