I manually did CheckIndex in all indexes and found two with issues: first Segments file=segments_42w numSegments=21 version=FORMAT_HAS_PROX [Lucene 2.4] 1 of 21: name=_109 docCount=10410 compound=true hasProx=true numFiles=1 size (MB)=55,789 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [170764 terms; 6214606 terms/docs pairs; 29694242 tokens] test: stored fields.......OK [10410 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
2 of 21: name=_160 docCount=2178 compound=true hasProx=true numFiles=1 size (MB)=11,676 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [92275 terms; 1264593 terms/docs pairs; 5807908 tokens] test: stored fields.......OK [2178 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 3 of 21: name=_1i9 docCount=2225 compound=true hasProx=true numFiles=1 size (MB)=10,872 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [92284 terms; 1193562 terms/docs pairs; 5253891 tokens] test: stored fields.......OK [2225 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 4 of 21: name=_1qx docCount=2654 compound=true hasProx=true numFiles=1 size (MB)=14,374 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [95153 terms; 1623058 terms/docs pairs; 7151784 tokens] test: stored fields.......OK [2654 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 5 of 21: name=_248 docCount=2165 compound=true hasProx=true numFiles=1 size (MB)=11,446 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [92177 terms; 1243801 terms/docs pairs; 5632974 tokens] test: stored fields.......OK [2165 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 6 of 21: name=_2a4 docCount=2508 compound=true hasProx=true numFiles=1 size (MB)=13,917 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [94222 terms; 1566966 terms/docs pairs; 6967989 tokens] test: stored fields.......OK [2508 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 7 of 21: name=_2pu docCount=2092 compound=true hasProx=true numFiles=1 size (MB)=10,217 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [91592 terms; 1105091 terms/docs pairs; 4865932 tokens] test: stored fields.......OK [2092 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 8 of 21: name=_35q docCount=3285 compound=true hasProx=true numFiles=1 size (MB)=16,378 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...ERROR [term attachments:c7a0e2a47ebad4214e933621f6c9dac7: doc 426 <= lastDoc 426] java.lang.RuntimeException: term attachments:c7a0e2a47ebad4214e933621f6c9dac7: doc 426 <= lastDoc 426 at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319) at org.getopt.luke.Luke$6.run(Unknown Source) test: stored fields.......OK [3285 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319) at org.getopt.luke.Luke$6.run(Unknown Source) 9 of 21: name=_3g7 docCount=2016 compound=true hasProx=true numFiles=1 size (MB)=11,813 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [91094 terms; 1286115 terms/docs pairs; 5942307 tokens] test: stored fields.......OK [2016 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 10 of 21: name=_3o8 docCount=1702 compound=true hasProx=true numFiles=1 size (MB)=10,008 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [89179 terms; 1081620 terms/docs pairs; 4960310 tokens] test: stored fields.......OK [1702 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 11 of 21: name=_3wq docCount=2249 compound=true hasProx=true numFiles=1 size (MB)=13,441 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [92829 terms; 1491635 terms/docs pairs; 6851263 tokens] test: stored fields.......OK [2249 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 12 of 21: name=_3zd docCount=2264 compound=true hasProx=true numFiles=1 size (MB)=14,153 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [92747 terms; 1602790 terms/docs pairs; 7248272 tokens] test: stored fields.......OK [2264 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 13 of 21: name=_44r docCount=2743 compound=true hasProx=true numFiles=1 size (MB)=16,56 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [95135 terms; 1817602 terms/docs pairs; 8534778 tokens] test: stored fields.......OK [2743 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 14 of 21: name=_4fl docCount=2105 compound=true hasProx=true numFiles=1 size (MB)=13,041 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [91940 terms; 1447522 terms/docs pairs; 6565862 tokens] test: stored fields.......OK [2105 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 15 of 21: name=_4gf docCount=1891 compound=true hasProx=true numFiles=1 size (MB)=10,88 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [90339 terms; 1186203 terms/docs pairs; 5392055 tokens] test: stored fields.......OK [1891 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 16 of 21: name=_54z docCount=2007 compound=true hasProx=true numFiles=1 size (MB)=10,848 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [91154 terms; 1199551 terms/docs pairs; 5284819 tokens] test: stored fields.......OK [2007 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 17 of 21: name=_5ao docCount=530 compound=true hasProx=true numFiles=1 size (MB)=3,743 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [76964 terms; 311259 terms/docs pairs; 1569162 tokens] test: stored fields.......OK [530 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 18 of 21: name=_5am docCount=113 compound=true hasProx=true numFiles=1 size (MB)=2,118 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [66266 terms; 142915 terms/docs pairs; 711223 tokens] test: stored fields.......OK [113 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 19 of 21: name=_5ap docCount=117 compound=true hasProx=true numFiles=1 size (MB)=0,525 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [15930 terms; 50518 terms/docs pairs; 203306 tokens] test: stored fields.......OK [117 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 20 of 21: name=_5ar docCount=165 compound=true hasProx=true numFiles=1 size (MB)=0,73 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [22756 terms; 67573 terms/docs pairs; 273697 tokens] test: stored fields.......OK [165 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 21 of 21: name=_5at docCount=115 compound=true hasProx=true numFiles=1 size (MB)=0,409 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [14805 terms; 40657 terms/docs pairs; 139527 tokens] test: stored fields.......OK [115 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] WARNING: 1 broken segments (containing 3285 documents) detected =============================================================================================== second: Segments file=segments_7s4 numSegments=12 version=FORMAT_HAS_PROX [Lucene 2.4] 1 of 12: name=_3hm docCount=17119 compound=true hasProx=true numFiles=1 size (MB)=88,715 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...ERROR [term body:mas: doc 13174 <= lastDoc 13174] java.lang.RuntimeException: term body:mas: doc 13174 <= lastDoc 13174 at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319) at org.getopt.luke.Luke$6.run(Unknown Source) test: stored fields.......OK [17119 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319) at org.getopt.luke.Luke$6.run(Unknown Source) 2 of 12: name=_4bl docCount=3302 compound=true hasProx=true numFiles=1 size (MB)=19,256 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [98534 terms; 2136125 terms/docs pairs; 9970629 tokens] test: stored fields.......OK [3302 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 3 of 12: name=_4w9 docCount=4380 compound=true hasProx=true numFiles=1 size (MB)=27,348 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [103306 terms; 3081204 terms/docs pairs; 14235295 tokens] test: stored fields.......OK [4380 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 4 of 12: name=_5jh docCount=4372 compound=true hasProx=true numFiles=1 size (MB)=28,867 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [103363 terms; 3187068 terms/docs pairs; 15176139 tokens] test: stored fields.......OK [4372 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 5 of 12: name=_6nn docCount=6526 compound=true hasProx=true numFiles=1 size (MB)=32,316 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [111787 terms; 3720002 terms/docs pairs; 16710886 tokens] test: stored fields.......OK [6526 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 6 of 12: name=_82m docCount=6889 compound=true hasProx=true numFiles=1 size (MB)=36,753 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [113120 terms; 4284180 terms/docs pairs; 19188993 tokens] test: stored fields.......OK [6889 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 7 of 12: name=_8kv docCount=2286 compound=true hasProx=true numFiles=1 size (MB)=13,367 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [93101 terms; 1437797 terms/docs pairs; 6844358 tokens] test: stored fields.......OK [2286 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 8 of 12: name=_8i0 docCount=141 compound=true hasProx=true numFiles=1 size (MB)=1,618 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [63287 terms; 85330 terms/docs pairs; 414225 tokens] test: stored fields.......OK [141 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 9 of 12: name=_8kt docCount=275 compound=true hasProx=true numFiles=1 size (MB)=2,549 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [73391 terms; 173116 terms/docs pairs; 847099 tokens] test: stored fields.......OK [275 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 10 of 12: name=_8n3 docCount=103 compound=true hasProx=true numFiles=1 size (MB)=0,618 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [18925 terms; 50696 terms/docs pairs; 248460 tokens] test: stored fields.......OK [103 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 11 of 12: name=_8n4 docCount=3 compound=true hasProx=true numFiles=1 size (MB)=1,209 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [49084 terms; 49234 terms/docs pairs; 267700 tokens] test: stored fields.......OK [3 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] 12 of 12: name=_8n5 docCount=122 compound=true hasProx=true numFiles=1 size (MB)=0,66 no deletions test: open reader.........OK test: fields..............OK [6 fields] test: field norms.........OK [6 fields] test: terms, freq, prox...OK [20211 terms; 54518 terms/docs pairs; 267390 tokens] test: stored fields.......OK [122 total field count; avg 1 fields per doc] test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc] WARNING: 1 broken segments (containing 17119 documents) detected I have not been able to reproduce this in my env. javier On Fri, Nov 27, 2009 at 12:23 PM, jm <jmugur...@gmail.com> wrote: > Ok, I got the index from the production machine, but I am having some > problem to find the index..., our process deals with multiple indexes, > in the current exception I cannot see any indication about the index > having the issue. I opened all my indexes with luke and old opened > succesfully, some had around 4000 docs (so I can discard those I > think), and some had around 40.000 docs, I was hoping one that had > around 24658 docs, but I guess this is the number of docs in one > segment, not the whole index right? > > Any idea to help find the index? > > In the codebase it would be nice to give any hint about the index > (path in case of a fsdirectory etc) when found an exception, dont > assume there is a single index in the application. > > javier > > On Fri, Nov 27, 2009 at 11:39 AM, Michael McCandless > <luc...@mikemccandless.com> wrote: >> Also, if you're able to reproduce this, can you call >> writer.setInfoStream and capture & post the resulting output leading >> up to the exception? >> >> Mike >> >> On Thu, Nov 26, 2009 at 7:12 AM, jm <jmugur...@gmail.com> wrote: >>> The process is still running and ops dont want to stop it. As soon as >>> stops I'll try checkindex. >>> >>> Its created brand new with 2.4.1 >>> >>> On Thu, Nov 26, 2009 at 12:42 PM, Michael McCandless >>> <luc...@mikemccandless.com> wrote: >>>> I think you're using a JRE that has the fix for the issue found in >>>> LUCENE-1282. >>>> >>>> Can you run CheckIndex on your index and post the output? >>>> >>>> Was this index created from scratch on Lucene 2.4.1? Or, created from >>>> an earlier Lucene version? >>>> >>>> Mike >>>> >>>> On Thu, Nov 26, 2009 at 6:03 AM, jm <jmugur...@gmail.com> wrote: >>>>> or are we really? I think we are on 1.6 update 14 right?? >>>>> >>>>> sorry Im lost right now on jdk version numbering >>>>> >>>>> On Thu, Nov 26, 2009 at 12:01 PM, jm <jmugur...@gmail.com> wrote: >>>>>> on second thought...I hadnt noticed the jdk numbers properly, we are >>>>>> using using b28, and JDK 6 Update 10 (b28) is the one fixing this... >>>>>> >>>>>> ok forget this then >>>>>> thanks! >>>>>> >>>>>> On Thu, Nov 26, 2009 at 11:55 AM, jm <jmugur...@gmail.com> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Dont know if this should be here or in java-dev, posting to this one >>>>>>> first. In one of our installations, we have encountered an exception: >>>>>>> >>>>>>> Exception in thread "Lucene Merge Thread #0" >>>>>>> org.apache.lucene.index.MergePolicy$MergeException: >>>>>>> org.apache.lucene.index.CorruptIndexException: docs out of order >>>>>>> (24658 <= 24658 ) >>>>>>> at >>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309) >>>>>>> at >>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286) >>>>>>> Caused by: org.apache.lucene.index.CorruptIndexException: docs out of >>>>>>> order (24658 <= 24658 ) >>>>>>> at >>>>>>> org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:641) >>>>>>> at >>>>>>> org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:586) >>>>>>> at >>>>>>> org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:547) >>>>>>> at >>>>>>> org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:500) >>>>>>> at >>>>>>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:140) >>>>>>> at >>>>>>> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4231) >>>>>>> at >>>>>>> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884) >>>>>>> at >>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205) >>>>>>> at >>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260) >>>>>>> >>>>>>> I have investigated in the list and it looked like >>>>>>> https://issues.apache.org/jira/browse/LUCENE-1282. But we are using >>>>>>> 2.4.1, and >>>>>>> C:\Documents and Settings\Administrator>java -version >>>>>>> java version "1.6.0_14" >>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_14-b08) >>>>>>> Java HotSpot(TM) Client VM (build 14.0-b16, mixed mode) >>>>>>> >>>>>>> java process launched like this: >>>>>>> java -server -Xmx1536m ... >>>>>>> >>>>>>> So I understand this bug should not be happening?? >>>>>>> >>>>>>> any idea? >>>>>>> thanks >>>>>>> javi >>>>>>> >>>>>> >>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>> >>>>> >>>> >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>> >>>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>> >>> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org