Hello,

My Lucene index contains 46 segments with a total of 4M docs. Lately, while running queries, I have started getting occasional exceptions from this index:

java.lang.ArrayIndexOutOfBoundsException
    at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:196)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.refillPositions(Lucene41PostingsReader.java:796)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:961)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:988)
    at org.apache.lucene.search.ExactPhraseScorer.phraseFreq(ExactPhraseScorer.java:213)
    at …

Looking at the code, the exception comes from this line:

    final int encodedSize = encodedSizes[numBits];

These exceptions cause about 5% of queries to fail; I have not yet found a pattern in which queries are affected.

I ran CheckIndex on this index and got the following output for one of the segments:

Segments file=segments_4k4 numSegments=46 version=4.3 format= userData={commitTimeMSec=1382425789488}
  1 of 46: name=_1ye docCount=67529
    Codec=Lucene42
    Compound=false
    numFiles=15
    size (MB)=4,922.155
    diagnostics = {timestamp=1371533248779, os=Linux, os.version=2.6.32-279.e16.x86_64, mergeFactor=15, source=merge, lucene.version=4.3.0 1477023 – simonw – 2013-04-29 14:55:14, os.arch=amd64, mergeMaxSumSegments=-1, java.version=1.7.0_11, java.vendor=Oracle Corporation}
    has deletions [delGen=417]
    test: open reader………FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.AssertionError: liveDocs.count()=40242 info.docCount=67529 info.getDelCount()=27193
    at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:92)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:61)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1854)

The other segments look fine, for example:

  2 of 46: name=_3wz docCount=130918 Codec=Lucene42 Compound=false numFiles=15 size (MB)=4,982.155
    diagnostics = {timestamp=1372838229010, os=Linux, os.version=2.6.32-279.e16.x86_64, mergeFactor=15, source=merge, lucene.version=4.3.0 1477023 – simonw – 2013-04-29 14:55:14, os.arch=amd64, mergeMaxSumSegments=-1, java.version=1.7.0_11, java.vendor=Oracle Corporation}
    has deletions [delGen=552]
    test: open reader………OK [24610 deleted docs]
    test: fields………………..OK [235 fields]
    test: field norms…..OK [29 fields]
    test: terms, freq, prox….OK [45127880 terms; 25487529 terms/docs pairs; 854489030 tokens]
    test (ignoring deletes): terms, freq, prox…OK [50784030 terms; 305300244 terms/docs pairs; 854489030 tokens]
    test: stored fields…….OK [41472391 total field count; avg 390 fields per doc]
    test: term vectors……OK [268790 total vector count; avg 3.035 term/freq vector fields per doc]
    test: docvalues……….OK [0 total doc count; 1 docvalues fields]

Does anyone know what kind of corruption might cause this exception when opening a reader?

By the way, the index above is one of several shards in the same collection (managed by Solr). The other shards are in a good state.

Thanks in advance,
Manuel
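
P.S. To spell out the numbers in the failed assertion: as far as I can tell, it expects liveDocs.count() to equal docCount minus delCount. Below is a minimal sketch of that arithmetic using the values from the log above. LiveDocsCheck and its methods are hypothetical names I made up for illustration; they are not part of the Lucene API.

```java
// Hypothetical helper, not part of Lucene: reproduces the arithmetic behind
// the AssertionError that CheckIndex reported for segment _1ye.
final class LiveDocsCheck {

    // Number of live documents implied by the segment metadata.
    static int expectedLiveDocs(int docCount, int delCount) {
        return docCount - delCount;
    }

    // Difference between the metadata-implied live count and the count
    // reported by the live-docs bit set; 0 means the two agree.
    static int mismatch(int docCount, int delCount, int liveDocsCount) {
        return expectedLiveDocs(docCount, delCount) - liveDocsCount;
    }

    public static void main(String[] args) {
        int docCount = 67529;      // info.docCount from the log
        int delCount = 27193;      // info.getDelCount() from the log
        int liveDocsCount = 40242; // liveDocs.count() from the log

        System.out.println("expected live docs: " + expectedLiveDocs(docCount, delCount));
        System.out.println("mismatch: " + mismatch(docCount, delCount, liveDocsCount));
    }
}
```

With these values the metadata implies 40336 live docs, while the live-docs file reports 40242, so the two disagree by 94 documents.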