Hi Mike, Apologies for the delay in getting back. I have since figured out that the reason Luke gave an error when we searched on the "fixed" index was (possibly) because it was a really old version (0.6 2005/02/16) - I tried again with v 0.8.1 (2008-02-13) and Luke can search on the "fixed" index just fine.
When I first ran CheckIndex the exception was as follows: java.lang.RuntimeException: term body:03: doc -2147483645 < lastDoc 44 at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:195) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:373) WARNING: 1 broken segments detected WARNING: 162 documents will be lost So for now I think the problem may lie in CLucene (or possibly our use of it) rather than CheckIndex not detecting and repairing properly. I will follow up with an update if/when I figure it out! In the meantime if you have any ideas as to how/why a corruption of this nature can occur I'd be glad to hear them. Thanks for your help, John. -----Original Message----- >From Michael McCandless <[EMAIL PROTECTED]> Subject Re: CheckIndex possibly not detecting/fixing all corruptions? Date Wed, 30 Jul 2008 10:43:29 GMT Do you have the exception Luke produced? That'd be a good clue as to what CheckIndex is not detecting. It's hard for me to tell from that GDB trace exactly what's gone wrong... When you first ran CheckIndex, and it detected one corrupt segment, what exception did it report as the cause of the corruption on that segment? If you could upload the index somewhere, that'd be great -- I'll try to figure out why CheckIndex fails to detect the corruption. It's particularly odd that you were able to successfully optimize the index. If you run searches on your index use Lucene java, do you get any odd exceptions? Mike -----Original Message----- From: John O'Brien [mailto:[EMAIL PROTECTED] Sent: 30 July 2008 11:22 To: 'java-user@lucene.apache.org' Subject: CheckIndex possibly not detecting/fixing all corruptions? Hi, I already posted this question on the CLucene dev list but it was suggested that I may be able to get some help on the Java list so here goes. We use Clucene 0.9.20 in our search engine. One of the indexes appears to have become corrupt (still investigating the cause of the corruption). We tried using the Java Lucene CheckIndex tool to fix the corruption(s). CheckIndex detected 1 broken segment and removed 162 documents and wrote a new segments file. Subsequent runs of CheckIndex report "No problems were detected with this index.". However our search engine still crashes when its performing a search on the "fixed" index. Here is the relevant backtrace: #8 0x001c0faa in lucene::index::SegmentTermDocs::read (this=0x664f26c8, docs=0x664f9944, freqs=0x664f99c4, length=32) at CLucene/util/BitSet.h:35 #9 0x001b4a4a in lucene::index::MultiTermDocs::read (this=0x664e5d38, docs=0x664f9944, freqs=0x664f99c4, length=32) at ../src/CLucene/index/MultiReader.cpp:400 #10 0x001ef7dc in lucene::search::TermScorer::next (this=0x664f9920) at ../src/CLucene/search/TermScorer.cpp:41 #11 0x001d2d23 in lucene::search::BooleanScorer::next (this=0x665f2fc0) at ../src/CLucene/search/BooleanScorer.cpp:63 #12 0x001d2d23 in lucene::search::BooleanScorer::next (this=0x664e5ec8) at ../src/CLucene/search/BooleanScorer.cpp:63 #13 0x001e3275 in lucene::search::IndexSearcher::_search ( this=0x65ba6410, query=0x66734ed8, filter=0x0, nDocs=100) at ../src/CLucene/search/Scorer.h:39 #14 0x001e1fb0 in lucene::search::Hits::getMoreDocs (this=0x66881440, m=50) at ../src/CLucene/search/Hits.cpp:110 #15 0x001e1acf in lucene::search::Hits::Hits () at CLucene/util/PriorityQueue.h:35 We tried to perform a search using Luke tool but this also resulted in an error. Also tried after optimizing the index db, but the same error persists. So it looks like the index db might still be corrupt. Any ideas as to why CheckIndex appears not to have detected/fixed all corruptions? Are there any other suggestions as to how to detect/repair index corruptions? Thanks in advance, John. P.s. I still have the index in question before and after running CheckIndex fix on it so if anyone's willing to take a look at it I can upload it somewhere for you to download. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]