I've got Greg's bad segments file, and it does appear to be all zeros. If I drop it into an existing index directory under the name segments_N+1, it reproduces the error, i.e. IndexReader opens the index as if it contained zero docs. I'm preparing a Jira issue as we speak.
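
For anyone who wants to check their own segments_N files for the same symptom, here is a minimal sketch of an "all zeros" test in plain Java. It is not part of Lucene; the class name ZeroFileCheck and its methods are made up for illustration, and treating an empty file as "not all zeros" is my own assumption so that a zero-length file is reported separately from a zero-filled one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Lucene class: reports whether a file consists
// entirely of zero bytes, as the bad segments file in this thread does.
public class ZeroFileCheck {

    // Returns true only if there is at least one byte and every byte is 0.
    // An empty array returns false (assumption: empty is a different failure).
    static boolean isAllZeros(byte[] data) {
        if (data.length == 0) {
            return false;
        }
        for (byte b : data) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    // Convenience wrapper: segments_N files are small, so reading the whole
    // file into memory is acceptable here.
    static boolean isAllZeros(Path file) throws IOException {
        return isAllZeros(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ZeroFileCheck <path-to-segments_N>");
            return;
        }
        Path p = Paths.get(args[0]);
        System.out.println(p + " all zeros: " + isAllZeros(p));
    }
}
```

Run it against each segments_N file in the index directory; a file that reports true matches what I am seeing in Greg's file.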

----- Original Message ----
From: Michael McCandless <luc...@mikemccandless.com>
To: java-user@lucene.apache.org
Sent: Tue, 28 June, 2011 14:59:48
Subject: Re: Corrupt segments file full of zeros

On Tue, Jun 28, 2011 at 9:29 AM, mark harwood <markharw...@yahoo.co.uk> wrote:
> Hi Mike.
>>>Hmmm -- what code are you running here, to print the number of docs?
>
> SegmentInfos.setInfoStream(System.out);
> FSDirectory dir = FSDirectory.open(new File("j:/indexes/myindex"));
> IndexReader r = IndexReader.open(dir, true);
> System.out.println("index has "+r.maxDoc()+" docs");
>
> From my own tests outside of Greg's environment I've found Lucene to be doing
> all the right things and IndexReader falls back gracefully to the previous
> commit e.g. here is the output from when I deliberately killed an update after
> prepareToCommit, leaving segment_2 and segment_3 and then vandalised segment_3
> with all zero bytes:
> SIS [main]: directory listing genA=3
> SIS [main]: fallback check: 2; 2
> SIS [main]: segments.gen check: genB=2
> SIS [main]: primary Exception on 'segments_3': java.io.IOException: read past
> EOF'; will retry: retry=false; gen = 3
> SIS [main]: fallback to prior segment file 'segments_2'
> SIS [main]: success on fallback segments_2
>
> Lucene does the right thing going back to _2. I can't yet see why in Greg's
> environment (NFS based) it fails to see _4vc as corrupt in the same way the
> above test correctly sees _3 as corrupt.

Hmm. Mark, if you vandalise segments_3 with 0s and then remove segments_2, what happens when you try to open the IndexReader? (I would expect an exception.)

Greg, can you post the full stdout you see from SIS after enabling its infoStream, in the case that returns an IR with 0 docs (i.e. when you delete segments_4vb)?

Also: if you don't delete any of the segments_N files and run the same code, how many docs do you get?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org