Simon Wistow wrote:

On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:

When you previously saw corruption was it due to an OS or machine
crash (or power cord got pulled)?  If so, you were likely hitting
LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
at some point) but is not fixed in 2.3.

Yes - it's power outages and other unnatural events (sysadmins
accidentally kill -9ing the process) that caused it.

OK, a power outage can definitely cause corruption.  This is a
long-standing issue (LUCENE-1044) that was only recently uncovered and
is now fixed for 2.4.  But I believe kill -9 should not cause corruption.

BTW hot backups, as of 2.3, are now very easy.  Just use
SnapshotDeletionPolicy when creating your writer.  Making frequent
backups is a good safeguard too...

What are the chances of me backporting the fix to 2.3, or should I just
wait for 2.4?

It unfortunately was a fairly large change; I'm not sure how cleanly
the patch will apply to 2.3.  Maybe try trunk (but beware: the index
format changed with LUCENE-1044 to add an integrity checksum to
the end of the segments_N file)...
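
To illustrate what a trailing integrity checksum buys you, here is a
minimal Java sketch.  It is not Lucene's actual code or its segments_N
format: the class name ChecksumTrailer, the seal/verify methods, and the
choice of CRC32 stored as an 8-byte trailer are all assumptions made for
illustration.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Lucene: appends a CRC32 of the payload
// as an 8-byte trailer and checks it again when the bytes are read back.
class ChecksumTrailer {

    // Return payload followed by its CRC32, stored as a big-endian long.
    public static byte[] seal(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 8)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // True only if the trailer matches a CRC32 recomputed over the body.
    public static boolean verify(byte[] sealed) {
        if (sealed.length < 8) {
            return false; // too short to even hold the trailer
        }
        int bodyLength = sealed.length - 8;
        CRC32 crc = new CRC32();
        crc.update(sealed, 0, bodyLength);
        long stored = ByteBuffer.wrap(sealed, bodyLength, 8).getLong();
        return stored == crc.getValue();
    }
}
```

A reader that finds verify() returning false can treat the file as
partially written and look for an older, intact copy instead.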

Come 2.4, is my buffering to RAM redundant?

Well, as Mark said, if your IO system does not lie on fsync, then
buffering to RAM is redundant.  If it does lie, you still have an open
risk of corruption, so buffering to RAM probably reduces (but doesn't
eliminate) that risk.
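
For reference, this is roughly what "fsync" looks like from plain Java.
It is a sketch using only the JDK, not Lucene's implementation: the class
name DurableWrite and the method writeDurably are made up for
illustration.  FileChannel.force(true) asks the OS to push the file's
content and metadata to the storage device; if the device or its driver
acknowledges that without really writing, the call cannot detect it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Lucene.
class DurableWrite {

    // Write data to file, then request an fsync before returning.
    // Returns the size of the file after the write.
    public static long writeDurably(Path file, byte[] data) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer); // may write fewer bytes than requested
            }
            // Ask the OS to flush content and metadata to the device.
            channel.force(true);
            return channel.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```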

Also, as of 2.3, manually buffering to RAMDirectory should no longer
give a big performance win over just giving that RAM to the
IndexWriter as its buffer instead.

Mike
