It's better to use SnapshotDeletionPolicy to grab a consistent image of the index. You don't need to close the IndexWriter, or even stop making changes through it. The policy lets you capture a given segments_N (and all the index files it references), and then take your time copying or backing up every file in the snapshot, since those files are protected from deletion until the snapshot is released.
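
To make the "copy all files in the snapshot" step concrete, here is a minimal sketch in plain Java. It assumes you have already asked Lucene for the snapshot's file names; in recent releases that is roughly `IndexCommit commit = policy.snapshot()` followed by `commit.getFileNames()`, with `policy.release(commit)` once the copy is done, but check the signatures for your version. The class `SnapshotCopier` and its method are hypothetical names for illustration, not part of Lucene.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collection;

public class SnapshotCopier {

    /**
     * Copies exactly the files named in the snapshot from indexDir to backupDir.
     * Files in indexDir that are not part of the snapshot are ignored.
     * Assumption: the caller holds the snapshot for the whole duration of
     * this call, so none of the named files can be deleted underneath us.
     *
     * @return the number of files copied
     */
    public static int copySnapshot(Path indexDir,
                                   Collection<String> snapshotFileNames,
                                   Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        int copied = 0;
        for (String name : snapshotFileNames) {
            Files.copy(indexDir.resolve(name),
                       backupDir.resolve(name),
                       StandardCopyOption.REPLACE_EXISTING);
            copied++;
        }
        return copied;
    }
}
```

The point of the design is that the copy loop never lists the directory itself. It only touches the names the snapshot handed over, so files written by the IndexWriter in the meantime are simply not seen.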

There's a "green paper", excerpted from the upcoming Lucene in Action revision, that covers how to use SnapshotDeletionPolicy for backing up an index:

  http://manning.com/free/green_HotBackupsLucene.html

(Disclaimers: 1) I wrote the article, 2) The link is frustrating because you have to submit your email address, then get email w/ a link that gives you a zip file, which you then unzip and open the index.html... I've been meaning to post the article directly to the Wiki so now seems like a good time!).

Mike

Michael Stoppelman wrote:

Hi Yonik,

Thanks for the response.

reply inline.

On Tue, Dec 16, 2008 at 6:44 AM, Yonik Seeley <ysee...@gmail.com> wrote:

On Tue, Dec 16, 2008 at 1:04 AM, Michael Stoppelman <stop...@gmail.com> wrote:

I've got a question from Doug's original email about replication
(http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg12709.html):

"1. On the index master, periodically checkpoint the index. Every minute or so the IndexWriter is closed and a 'cp -lr index index.DATE' command is executed from Java, where DATE is the current date and time. This efficiently makes a copy of the index when it's in a consistent state by constructing a tree of hard links. If Lucene re-writes any files (e.g., the segments file) a new inode is created and the copy is unchanged."
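
For readers who would rather not shell out to cp, the hard-link checkpoint quoted above can be sketched with java.nio. This is only an illustration under some assumptions: the filesystem supports hard links, the checkpoint directory sits outside the index directory on the same volume, and nothing is writing to the index while the links are created. The class name `HardLinkCheckpoint` is made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class HardLinkCheckpoint {

    /**
     * Mirrors indexDir into checkpointDir, creating directories as needed
     * and a hard link (not a byte copy) for every regular file.
     * Roughly what 'cp -lr index index.DATE' does.
     */
    public static void checkpoint(Path indexDir, Path checkpointDir) throws IOException {
        // Files.walk visits a directory before its children, so each
        // target directory exists by the time its files are linked.
        try (Stream<Path> paths = Files.walk(indexDir)) {
            for (Path src : (Iterable<Path>) paths::iterator) {
                Path dst = checkpointDir.resolve(indexDir.relativize(src).toString());
                if (Files.isDirectory(src)) {
                    Files.createDirectories(dst);
                } else {
                    // createLink(link, existing): dst becomes a second name
                    // for the same inode as src.
                    Files.createLink(dst, src);
                }
            }
        }
    }
}
```

Because each checkpoint entry is just another name for the same inode, the checkpoint stays intact only when files are replaced (deleted and re-created) rather than overwritten in place, which is the property the quoted paragraph relies on.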

Is closing the IndexWriter really a requirement for taking a snapshot? Or can one take a snapshot of an index that is being written? I've done this in my development environment and it seems to work fine w/o closing the IndexWriter.

There are subtle race conditions if you try to do this with a changing index. At any instant in time, the index should be consistent, *but* you can't actually make a snapshot instantaneously.


Is the race condition in writing out the segments.gen or segments_N files? From my understanding, once index segments are closed by the IndexWriter they aren't modified again (though they might be deleted if they're merged away).



So this is doable, but it would require some complex retry logic like
IndexReader has when opening an index.

Also, the Solr replication shell scripts don't seem to worry about this either.

Solr takes snapshots when it knows it's not updating the index (new
index changes are internally blocked when calling snapshooter).

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



