So the segments.gen was written, but then the machine crashed before Lucene could fsync it?
This is a "normal" case, and it should be fine for the filesystem to return 0s to Lucene: Lucene is supposed to be robust to this situation, falling back to a directory listing to find the largest segments_N file. Do you not see that logic kicking in?

But Lucene moved away from the .gen file in 5.0: https://issues.apache.org/jira/browse/LUCENE-5925

Can you reproduce the corruption with newer ES/Lucene versions?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 8, 2016 at 2:58 AM, Dawid Weiss <[email protected]> wrote:
> Crazy. It would be helpful if you could provide a repro of this as a
> small Java program one could run on a (small) NTFS partition (perhaps
> beasting it over and over to simulate this effect?).
>
> Dawid
>
> On Tue, Nov 8, 2016 at 5:04 AM, Thomas Kappler <[email protected]> wrote:
>> Hi all,
>>
>> We're occasionally observing corrupted indexes in production, on Windows
>> Server. We tracked it down to the way NTFS behaves in the case of partial
>> writes.
>>
>> When the disk or the machine fails during a flush, it's possible on NTFS
>> that the file being written has already been extended to its new length,
>> but the content is not yet visible. For security reasons, NTFS returns
>> all 0s when reading past the last successfully written point after the
>> system restarts.
>>
>> Lucene's commit code relies on writing an updated .gen file as the last
>> step of an index flush/update. In this case the file exists but contains
>> only 0s, making it unreadable for Lucene. A failure at this point leaves
>> the index in an unreadable state.
>>
>> We think the safest approach, which is robust to reordered writes, is to
>> treat a gen file containing all zeroes the same as a non-existing gen
>> file. This assumes that by the time the gen file is fsync'ed, all other
>> files have already been flushed to disk explicitly. If that's not the
>> case, there is still exposure to reordered writes.
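The all-zeros safeguard proposed above could look roughly like this (a sketch only; class and method names are illustrative, not actual Lucene API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the proposed safeguard: a segments.gen file that reads back as
// all zero bytes (the NTFS zero-fill left by a crashed partial write) is
// treated the same as a missing file, so the reader falls back to the
// directory listing instead of failing on an unreadable gen file.
public final class GenFileCheck {

    // True if the file content is empty or consists only of 0 bytes.
    static boolean looksZeroFilled(byte[] content) {
        if (content.length == 0) {
            return true; // empty file: nothing valid was ever committed
        }
        for (byte b : content) {
            if (b != 0) {
                return false; // real data survived the crash
            }
        }
        return true; // every byte is 0: NTFS zero-fill signature
    }

    // A gen file is usable only if it exists and is not zero-filled.
    static boolean genFileUsable(Path gen) throws IOException {
        return Files.exists(gen) && !looksZeroFilled(Files.readAllBytes(gen));
    }
}
```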
>>
>> I don't have a repro at this point. Before digging deeper into this I
>> wanted to see what the Lucene devs think. Does the proposed fix make
>> sense? Any ideas on how to set up a reproducible test for this issue?
>>
>> We verified this on Elasticsearch 1.7.1, which uses Lucene 4.10.4. Are
>> there significant changes to this area in newer Lucene versions?
>>
>> // Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> ---------------------------------------------------------------------
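For reference, the directory-listing fallback Mike describes amounts to scanning for segments_N files and taking the largest generation N (N is base-36 in Lucene's file names). The sketch below shows only the idea; it is not Lucene's actual SegmentInfos code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: when segments.gen is missing or unusable, pick the
// newest commit point by scanning file names for segments_N and keeping
// the largest generation, where N is encoded in base 36.
public final class GenFallback {
    private static final Pattern SEGMENTS = Pattern.compile("segments_([0-9a-z]+)");

    // Returns the largest generation among segments_N files, or -1 if none.
    static long largestGeneration(String[] fileNames) {
        long max = -1;
        for (String name : fileNames) {
            Matcher m = SEGMENTS.matcher(name);
            if (m.matches()) {
                max = Math.max(max, Long.parseLong(m.group(1), 36));
            }
        }
        return max;
    }
}
```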
