So the segments.gen was written, but then the machine crashed before
Lucene could fsync it?

This is a "normal" case, and it should be fine for the filesystem to
return 0s to Lucene: Lucene is supposed to be robust to this
situation, and fallback to a directory listing finding the largest
segments_N file to try.  Do you not see that logic kicking in?
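
(For reference, the fallback amounts to scanning the directory for the
highest segments_N generation; here's a simplified sketch of that idea,
not the actual SegmentInfos.FindSegmentsFile code:)

  import java.io.File;

  // Simplified illustration of the fallback: ignore segments.gen and
  // pick the highest generation among the segments_N files present.
  public class FindLargestGen {
    public static long largestGen(File indexDir) {
      long maxGen = -1;
      String[] files = indexDir.list();
      if (files == null) {
        return maxGen;
      }
      for (String name : files) {
        if (name.startsWith("segments_")) {
          // The generation is encoded in base 36 after the underscore.
          long gen = Long.parseLong(
              name.substring("segments_".length()), Character.MAX_RADIX);
          maxGen = Math.max(maxGen, gen);
        }
      }
      return maxGen; // -1 means no commit point was found
    }
  }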

But Lucene moved away from the .gen file in 5.0:
https://issues.apache.org/jira/browse/LUCENE-5925

Can you reproduce the corruption with newer ES/Lucene versions?
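
One way to simulate the zero-fill without waiting for a real crash might
be a small program along these lines (untested sketch against the Lucene
4.10.4 APIs, run against a directory on the NTFS partition; the
RandomAccessFile step stands in for NTFS's post-crash zero-fill):

  import java.io.File;
  import java.io.RandomAccessFile;

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.DirectoryReader;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.index.IndexWriterConfig;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.util.Version;

  public class ZeroedGenRepro {
    public static void main(String[] args) throws Exception {
      File path = new File(args[0]);

      // Create a one-doc index with a clean commit.
      try (FSDirectory dir = FSDirectory.open(path)) {
        IndexWriterConfig iwc = new IndexWriterConfig(
            Version.LUCENE_4_10_4,
            new StandardAnalyzer(Version.LUCENE_4_10_4));
        try (IndexWriter w = new IndexWriter(dir, iwc)) {
          w.addDocument(new Document());
          w.commit();
        }
      }

      // Simulate the NTFS post-crash state: same file length, all 0s.
      File gen = new File(path, "segments.gen");
      try (RandomAccessFile raf = new RandomAccessFile(gen, "rw")) {
        byte[] zeros = new byte[(int) raf.length()];
        raf.seek(0);
        raf.write(zeros);
      }

      // Does the open fall back to segments_N, or fail?
      try (FSDirectory dir = FSDirectory.open(path);
           DirectoryReader r = DirectoryReader.open(dir)) {
        System.out.println("opened OK, maxDoc=" + r.maxDoc());
      }
    }
  }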

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 8, 2016 at 2:58 AM, Dawid Weiss <[email protected]> wrote:
> Crazy. It would be helpful if you could provide a repro of this as a
> small Java program one could run on a (small) NTFS partition (perhaps
> beasting it over and over to simulate this effect?).
>
> Dawid
>
> On Tue, Nov 8, 2016 at 5:04 AM, Thomas Kappler
> <[email protected]> wrote:
>> Hi all,
>>
>>
>>
>> We’re occasionally observing corrupted indexes in production on Windows
>> Server. We tracked it down to the way NTFS behaves in the case of
>> partial writes.
>>
>>
>>
>> When the disk or the machine fails during a flush, it’s possible on NTFS
>> that the file being written has already been extended to its new length,
>> but the content is not visible yet. For security reasons, NTFS returns
>> all 0s when reading past the last successfully written point after the
>> system restarts.
>>
>>
>>
>> Lucene's commit code writes an updated .gen file as the last step of an
>> index flush/update. In this failure case the file exists but contains
>> only 0s, which Lucene cannot parse, so the index is left in an
>> unreadable state.
>>
>>
>>
>> We think the safest approach, and one that is robust to reordered
>> writes, is to treat a gen file containing all zeroes the same as a
>> non-existent gen file. This assumes that by the time the gen file is
>> fsync'ed, all other files have already been flushed to disk explicitly;
>> if that's not the case, there is still exposure to reordered writes.
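>>
>> A rough sketch of the check we have in mind (a hypothetical helper, not
>> existing Lucene code; it assumes the other index files were already
>> fsync'ed as described above):
>>
>>   import java.io.IOException;
>>
>>   import org.apache.lucene.store.Directory;
>>   import org.apache.lucene.store.IOContext;
>>   import org.apache.lucene.store.IndexInput;
>>
>>   class GenFileCheck {
>>     // Hypothetical helper: report whether segments.gen contains only
>>     // 0s, so callers can treat it as if the file were absent and let
>>     // the directory-listing fallback kick in instead.
>>     static boolean genFileIsAllZeros(Directory dir) throws IOException {
>>       try (IndexInput in =
>>           dir.openInput("segments.gen", IOContext.READONCE)) {
>>         for (long i = 0; i < in.length(); i++) {
>>           if (in.readByte() != 0) {
>>             return false;
>>           }
>>         }
>>         return true;
>>       }
>>     }
>>   }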
>>
>>
>>
>> I don’t have a repro at this point. Before digging deeper into this I wanted
>> to see what the Lucene devs think. Does the proposed fix make sense? Any
>> ideas on how to set up a reproducible test for this issue?
>>
>>
>>
>> We verified this on Elasticsearch 1.7.1 which uses Lucene 4.10.4. Are there
>> significant changes to this area in newer Lucene versions?
>>
>>
>>
>> // Thomas
>>
>>
>
