Before 2.4 it was possible that a crash of the OS, or sudden power loss to the machine, could corrupt the index. But that's been fixed with 2.4.
The only known sources of corruption are hardware faults (bad RAM, bad disk, etc.), and, accidentally allowing 2 writers to write to the same index at once (this will very quickly cause corruption). Lucene's write lock normally prevents this from happening. kill -9, JRE crashing, OS crashing, power loss, etc., should not cause corruption for Lucene >= 2.4. Backing up is definitely a good idea -- Lucene's SnapshotDeletionPolicy makes it easy to do a hot backup (backup even though IndexWriter is still changing the index). There's a paper on this (NOTE: I'm the author!) available at http://www.manning.com/hatcher3/ that gives details (look for "Hot backups with Lucene (green paper - html)". Mike On Wed, Nov 25, 2009 at 10:37 AM, Max Lynch <ihas...@gmail.com> wrote: > On Wed, Nov 25, 2009 at 9:31 AM, Ian Lea <ian....@gmail.com> wrote: > >> > What are the typical scenarios when the index will go corrupt? >> >> Dodgy disks. >> > > I also have had index corruption on two occasions. It is not a big deal for > me since my data is fairly real time so the old documents aren't as > important. > > However, I'm running this on a VPS with slicehost, so whether or not they > use dodgy disks is not something I can confirm or even deal with. > > I do need to upgrade to 2.9 from 2.4, but I think one of the reasons for my > index corruption is deleting the index.write file rather than removing the > lock through the Lucene APIs. This seems like it could be a cause of > corruption, correct? > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org