It's better to use SnapshotDeletionPolicy to grab a consistent image
of the index. You don't need to close the IndexWriter, nor stop
making changes through IndexWriter, and it lets you capture a given
segments_N (and all index files it needs) and then take your time
making a copy/backup/etc of all files in the snapshot.
There's a "green paper", excerpted from the upcoming Lucene in Action
revision, that covers how to use SnapshotDeletionPolicy for backing up
an index:
http://manning.com/free/green_HotBackupsLucene.html
(Disclaimers: 1) I wrote the article, 2) The link is frustrating
because you have to submit your email address, then get email w/ a
link that gives you a zip file, which you then unzip and open the
index.html... I've been meaning to post the article directly to the
Wiki so now seems like a good time!).
Mike
Michael Stoppelman wrote:
Hi Yonik,
Thanks for the response.
reply inline.
On Tue, Dec 16, 2008 at 6:44 AM, Yonik Seeley <ysee...@gmail.com>
wrote:
On Tue, Dec 16, 2008 at 1:04 AM, Michael Stoppelman <stop...@gmail.com
>
wrote:
I've got a question from Doug's original email about replication (
http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg12709.html
):
"1. On the index master, periodically checkpoint the index. Every
minute
or
so the IndexWriter is closed and a 'cp -lr index index.DATE'
command is
executed from Java, where DATE is the current date and time. This
efficiently makes a copy of the index when its in a consistent
state by
constructing a tree of hard links. If Lucene re-writes any files
(e.g.,
the
segments file) a new inode is created and the copy is unchanged."
Is closing the IndexWriter really a requirement on taking a
snapshot? Or
can
one take a snapshot on an index being written, I've done this in my
development environment and it seems to work fine w/o closing the
IndexWriter.
There are subtle race conditions if you try to do this with a
changing
index.
At any instance in time, the index should be consistent, *but* you
can't actually make a snapshot instantaneously.
Is the race condition in writing out the segments.gen or segments_N
files?
From my understanding index segments once closed by the IndexWriter
they
aren't modified again (they might be deleted though if they're
merged away).
So this is doable, but it would require some complex retry logic like
IndexReader has when opening an index.
Also the solr replication shell scripts don't seem to worry
about this either.
Solr takes snapshots when it knows it's not updating the index (new
index changes are internally blocked when calling snapshooter).
-Yonik
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org