OK, this is why Lucene (and Luke) consider the index fine: if
Lucene has problems opening segments_N (and a file of all 0s is
definitely not a valid segments_N file), it falls back to the previous
commit (segments_(N-1)) and opens that instead.

In other words, IndexReader.open and new IndexWriter(...) open the last
successful commit.
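
For a quick check of the segments files alone (without a full CheckIndex
run), something like the following may help. It is only a sketch and not
Lucene code: the class and method names are made up for illustration, it
uses plain java.io, and it assumes the 2.9.x layout from the file-format
spec quoted further down the thread, where segments_N begins with a
big-endian Int32 of -9 and N is written in base 36. It picks the newest
segments_N whose first four bytes match, which roughly mirrors the
fallback described above. It does not verify the rest of the file or the
segments it refers to.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Lucene. */
class SegmentsHeaderCheck {

    // Assumption: format marker for segments_N as written by Lucene 2.9.x.
    // Indexes written by other versions carry a different value.
    static final int EXPECTED_FORMAT = -9;

    /** True if the file starts with the expected Int32 format marker. */
    static boolean hasValidHeader(File segmentsFile) throws IOException {
        if (!segmentsFile.isFile() || segmentsFile.length() < 4) {
            return false;
        }
        DataInputStream in = new DataInputStream(new FileInputStream(segmentsFile));
        try {
            // readInt() is big-endian; an all-zero file yields 0 here.
            return in.readInt() == EXPECTED_FORMAT;
        } finally {
            in.close();
        }
    }

    /** Parses N from "segments_N" (base 36); returns -1 for any other name. */
    static long generationOf(String name) {
        String prefix = "segments_";
        if (!name.startsWith(prefix)) {
            return -1; // also skips segments.gen
        }
        try {
            return Long.parseLong(name.substring(prefix.length()), Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Newest segments_N with a plausible header, or null if there is none. */
    static File newestPlausibleCommit(File indexDir) throws IOException {
        File[] files = indexDir.listFiles();
        if (files == null) {
            return null; // not a directory, or not readable
        }
        File best = null;
        long bestGen = -1;
        for (File f : files) {
            long gen = generationOf(f.getName());
            if (gen > bestGen && hasValidHeader(f)) {
                best = f;
                bestGen = gen;
            }
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SegmentsHeaderCheck <indexDir>");
            return;
        }
        File best = newestPlausibleCommit(new File(args[0]));
        System.out.println(best == null
                ? "no plausible segments_N found"
                : "newest plausible commit: " + best.getName());
    }
}
```

If the newest segments_N on disk is not the one this reports, the latest
commit is probably damaged and Lucene is likely opening the older one.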

Mike McCandless

http://blog.mikemccandless.com

On Tue, Jun 28, 2011 at 8:28 AM, Tarr, Gregory <gregory.t...@detica.com> wrote:
> There was a segments_(N-1), which was a valid segments file and opened
> correctly in Luke.
>
> The trouble came because we had to manually rename these files in order to 
> prevent the index from being wiped.
>
> Thanks
>
> Greg
>
> -----Original Message-----
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: 28 June 2011 13:26
> To: java-user@lucene.apache.org
> Subject: Re: Corrupt segments file full of zeros
>
> Is there only one segments_N file in the index (the one with all 0s)?
> Or is there a segments_(N-1) too?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Jun 28, 2011 at 8:17 AM, Tarr, Gregory <gregory.t...@detica.com> wrote:
>> We don't have a -9 in the file. It isn't a valid Lucene segments file,
>> as it only contains zeros.
>>
>> We're wondering why this opens in Luke, and why CheckIndex reports
>> that the index is OK.
>>
>> -----Original Message-----
>> From: mark harwood [mailto:markharw...@yahoo.co.uk]
>> Sent: 28 June 2011 13:09
>> To: java-user@lucene.apache.org
>> Subject: Re: Corrupt segments file full of zeros
>>
>> According to the spec, the file should at least start with an Int32 of
>> -9 to declare the Format:
>> http://lucene.apache.org/java/2_9_3/fileformats.html#Segments File
>>
>>
>>
>> ----- Original Message ----
>> From: Uwe Schindler <u...@thetaphi.de>
>> To: java-user@lucene.apache.org
>> Sent: Tue, 28 June, 2011 12:32:34
>> Subject: RE: Corrupt segments file full of zeros
>>
>> So what exactly is the problem? Why shouldn't a segments file contain
>> lots of zeroes? If the index is not corrupt, all is fine.
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>>> -----Original Message-----
>>> From: Tarr, Gregory [mailto:gregory.t...@detica.com]
>>> Sent: Tuesday, June 28, 2011 11:56 AM
>>> To: java-user@lucene.apache.org
>>> Subject: RE: Corrupt segments file full of zeros
>>>
>>> Yes I have done that, and you just get "No problems were detected with
>>> this index"
>>>
>>> Surely there is a major problem with this index?
>>>
>>> Also the check() procedure takes a long time - is there any way you can
>>> just do a health check on the segments file?
>>>
>>> Thanks
>>>
>>> Greg
>>>
>>> -----Original Message-----
>>> From: Shai Erera [mailto:ser...@gmail.com]
>>> Sent: 28 June 2011 10:36
>>> To: java-user@lucene.apache.org
>>> Subject: Re: Corrupt segments file full of zeros
>>>
>>> You can try the CheckIndex tool. You feed it a directory and call
>>> .check() and it reports the results.
>>>
>>> Shai
>>>
>>> On Tue, Jun 28, 2011 at 11:46 AM, Tarr, Gregory
>>> <gregory.t...@detica.com> wrote:
>>>
>>> > We have a problem with our fileserver where our indexes are hosted
>>> > remotely, using Lucene 2.9.3.
>>> >
>>> > This can mean that a segments file is written which is full of
>>> > NUL (zero) bytes. Using the od -ah command, we get:
>>> >
>>> > 0000000 nul nul nul nul nul nul nul....etc
>>> >
>>> > If opened in Luke, the index opens successfully but has zero
>>> > documents.
>>> >
>>> > Why does this open correctly in Luke, and is there a procedure in the
>>> > Lucene code that can verify a segments file, e.g. check whether it
>>> > refers to any segments?
>>> >
>>> > Thanks
>>> >
>>> > Greg
>>> >
>>> >
>>> > Please consider the environment before printing this email.
>>> >
>>> > This message should be regarded as confidential. If you have received
>>> > this email in error please notify the sender and destroy it
>>> > immediately.
>>> >
>>> > Statements of intent shall only become binding when confirmed in hard
>>> > copy by an authorised signatory.  The contents of this email may
>>> > relate to dealings with other companies under the control of Detica
>>> > Limited, details of which can be found at
>>> > http://www.detica.com/statutory-information.
>>> >
>>> > Detica Limited is registered in England under No: 1337451.
>>> > Registered offices: Surrey Research Park, Guildford, Surrey, GU2 7YP,
>>> > England.
>>> >
>>> >
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

