OK, this is why Lucene (and Luke) consider the index fine, ie, if Lucene has problems opening segments_N (all 0s is definitely not a valid segments_N file), it falls back to the last commit (segments_(N-1)) and opens that instead.
Ie, IR.open and new IW(...) open the last successful commit. Mike McCandless http://blog.mikemccandless.com On Tue, Jun 28, 2011 at 8:28 AM, Tarr, Gregory <gregory.t...@detica.com> wrote: > There was a segments_(N-1), which was a valid segments file and opened > correctly in luke. > > The trouble came because we had to manually rename these files in order to > prevent the index from being wiped. > > Thanks > > Greg > > -----Original Message----- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: 28 June 2011 13:26 > To: java-user@lucene.apache.org > Subject: Re: Corrupt segments file full of zeros > > Is there only one segments_N file in the index (the one with all 0s)? > Or is there a segments_(N-1) too? > > Mike McCandless > > http://blog.mikemccandless.com > > On Tue, Jun 28, 2011 at 8:17 AM, Tarr, Gregory <gregory.t...@detica.com> > wrote: >> We don't have a -9 in the file. It isn't a valid lucene segments file, >> as it only contains zeros. >> >> We're wondering why this opens in Luke, and why the CheckIndex reports >> that the index is OK. >> >> -----Original Message----- >> From: mark harwood [mailto:markharw...@yahoo.co.uk] >> Sent: 28 June 2011 13:09 >> To: java-user@lucene.apache.org >> Subject: Re: Corrupt segments file full of zeros >> >> According to the spec there should at least be an Int32 of -9 to >> declare the Format - >> http://lucene.apache.org/java/2_9_3/fileformats.html#Segments File >> >> >> >> ----- Original Message ---- >> From: Uwe Schindler <u...@thetaphi.de> >> To: java-user@lucene.apache.org >> Sent: Tue, 28 June, 2011 12:32:34 >> Subject: RE: Corrupt segments file full of zeros >> >> So where is the problem at all? Why should a segments file not contain >> lots of zeroes? If the index is not corrupt all is fine. >> >> ----- >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >>> -----Original Message----- >>> From: Tarr, Gregory [mailto:gregory.t...@detica.com] >>> Sent: Tuesday, June 28, 2011 11:56 AM >>> To: java-user@lucene.apache.org >>> Subject: RE: Corrupt segments file full of zeros >>> >>> Yes I have done that, and you just get "No problems were detected >>> with >> this >>> index" >>> >>> Surely there is a major problem with this index? >>> >>> Also the check() procedure takes a long time - is there any way you >> can >> just >>> do a health check on the segments file? >>> >>> Thanks >>> >>> Greg >>> >>> -----Original Message----- >>> From: Shai Erera [mailto:ser...@gmail.com] >>> Sent: 28 June 2011 10:36 >>> To: java-user@lucene.apache.org >>> Subject: Re: Corrupt segments file full of zeros >>> >>> You can try the CheckIndex tool. You feed it a directory and call >>> .check() and it reports the results. >>> >>> Shai >>> >>> On Tue, Jun 28, 2011 at 11:46 AM, Tarr, Gregory >>> <gregory.t...@detica.com>wrote: >>> >>> > We have a problem with our fileserver where our indexes are hosted >>> > remotely, using Lucene 2.9.3. >>> > >>> > This can mean that a segments file is written which is full of >>> > ASCII zeros. Using the od -ah command, we get: >>> > >>> > 0000000 nul nul nul nul nul nul nul....etc >>> > >>> > If opened in Luke, the index opens successfully but has zero >>> documents. >>> > >>> > Why does this open correctly in luke, and is there a procedure in >> the >>> > lucene code that can verify a segments file, e.g. check whether it >>> > refers to any segments? >>> > >>> > Thanks >>> > >>> > Greg >>> > >>> > >>> > Please consider the environment before printing this email. >>> > >>> > This message should be regarded as confidential. If you have >> received >>> > this email in error please notify the sender and destroy it >>> immediately. >>> > >>> > Statements of intent shall only become binding when confirmed in >> hard >>> > copy by an authorised signatory. The contents of this email may >>> > relate to dealings with other companies under the control of Detica >>> > Limited, details of which can be found at >>> http://www.detica.com/statutory-information. >>> > >>> > Detica Limited is registered in England under No: 1337451. >>> > Registered offices: Surrey Research Park, Guildford, Surrey, GU2 >> 7YP, >>> > England. >>> > >>> > >>> Please consider the environment before printing this email. >>> >>> This message should be regarded as confidential. If you have received >> this >>> email in error please notify the sender and destroy it immediately. >>> >>> Statements of intent shall only become binding when confirmed in hard >> copy >>> by an authorised signatory. The contents of this email may relate to >> dealings >>> with other companies under the control of Detica Limited, details of >> which >>> can be found at http://www.detica.com/statutory-information. >>> >>> Detica Limited is registered in England under No: 1337451. >>> Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP, >>> England. >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> Please consider the environment before printing this email. >> >> This message should be regarded as confidential. If you have received this >> email in error please notify the sender and destroy it immediately. >> >> Statements of intent shall only become binding when confirmed in hard copy >> by an authorised signatory. The contents of this email may relate to >> dealings with other companies under the control of Detica Limited, details >> of which can be found at http://www.detica.com/statutory-information. >> >> Detica Limited is registered in England under No: 1337451. >> Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP, >> England. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > Please consider the environment before printing this email. > > This message should be regarded as confidential. If you have received this > email in error please notify the sender and destroy it immediately. > > Statements of intent shall only become binding when confirmed in hard copy by > an authorised signatory. The contents of this email may relate to dealings > with other companies under the control of Detica Limited, details of which > can be found at http://www.detica.com/statutory-information. > > Detica Limited is registered in England under No: 1337451. > Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP, England. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org