Thanks Mike. I'm using Windows XP with Java 1.6.0 and Lucene 2.4.1.
I don't know if I'm using the right backup strategy. I have an indexing process that runs from time to time, and the index gets bigger every day. Hence, copying the entire index every time is becoming a painful, endless backup activity. Given this scenario, I would like to copy not the whole index on each backup, but only the difference from the previous backup. Here is the plan:

1. Create an index writer;
2. Take a snapshot to prevent existing files from changing;
3. Index until no more documents remain to be indexed;
4. Retrieve the list of index files that didn't change with this.snapshotDeletionPolicy.snapshot().getFileNames();
5. Copy all files inside the index folder other than those retrieved in step 4 to the backup directory;
6. Optimize and close the index.

I really don't know if this is the best approach to my problem. I believe people use the SnapshotDeletionPolicy to copy the whole index up to the last commit, whereas I want to copy only the difference since the last backup.

What I'm thinking about doing now is to index into a new directory at every indexing iteration, copy that small index to the backup folder, and merge it with the main index afterwards.

What do you guys think I should do?

Lucas

On Fri, Aug 14, 2009 at 8:05 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> Alas I don't see it failing, with the optimize left in. Which exact
> rev of 2.9 are you testing with? Which OS/filesystem/JRE?
>
> I realize this is just a test so what follows may not apply to your
> "real" usage of SnapshotDeletionPolicy...:
>
> Since you're closing the writer before taking the backup, there's no
> need to even use SnapshotDeletionPolicy (you can just copy all files
> you find in the index).
>
> SnapshotDeletionPolicy's purpose is to enable taking backups while an
> IndexWriter is still open & making ongoing changes to the index, ie a
> "hot backup".
>
> Finally, you're taking the snapshot before doing any indexing...
> which means your backup will only reflect the index as of the last
> commit before you did any indexing.
>
> Mike
>
> On Fri, Aug 14, 2009 at 4:55 PM, Lucas Nazário dos
> Santos <nazario.lu...@gmail.com> wrote:
> > Not as small as I would like, but it shows the problem.
> >
> > If you remove the statement
> >
> >     // Remove this and the backup works fine
> >     optimize(policy);
> >
> > the backup works wonderfully.
> >
> > (More code in the next e-mail)
> >
> > Lucas
> >
> > public static void main(final String[] args)
> >         throws CorruptIndexException, IOException, InterruptedException {
> >     final SnapshotDeletionPolicy policy =
> >             new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
> >     final IndexBackup backup = new IndexBackup(policy, new File("backup"));
> >     for (int i = 0; i < 3; i++) {
> >         index(policy, backup);
> >     }
> > }
> >
> > private static void index(final SnapshotDeletionPolicy policy,
> >         final IndexBackup backup) throws CorruptIndexException,
> >         LockObtainFailedException, IOException {
> >     IndexWriter writer = null;
> >     try {
> >         FSDirectory.setDisableLocks(true);
> >         writer = new IndexWriter(FSDirectory.getDirectory("index"),
> >                 new StandardAnalyzer(), policy, MaxFieldLength.UNLIMITED);
> >
> >         System.out.println("Start: " + backup.willBackupFromNowOn());
> >
> >         for (int i = 0; i < 10000; i++) {
> >             final Document document = new Document();
> >             document.add(new Field("content",
> >                     "content content content content",
> >                     Store.YES, Index.ANALYZED));
> >             writer.addDocument(document);
> >         }
> >     } finally {
> >         if (writer != null) {
> >             writer.close();
> >         }
> >
> >         System.out.println("Backup: " + backup.backup());
> >
> >         // Remove this and the backup works fine
> >         optimize(policy);
> >     }
> > }
> >
> > private static void optimize(final SnapshotDeletionPolicy policy)
> >         throws CorruptIndexException, LockObtainFailedException,
> >         IOException {
> >     IndexWriter writer = null;
> >     try {
> >         writer = new
> >                 IndexWriter(FSDirectory.getDirectory("index"),
> >                         new StandardAnalyzer(), policy, MaxFieldLength.UNLIMITED);
> >         writer.optimize();
> >     } finally {
> >         writer.close();
> >     }
> > }
> >
> > On Fri, Aug 14, 2009 at 4:37 PM, Shai Erera <ser...@gmail.com> wrote:
> >
> >> I think you should also delete from the backup the files that no
> >> longer exist in the index?
> >>
> >> Shai
> >>
> >> On Fri, Aug 14, 2009 at 10:02 PM, Michael McCandless <
> >> luc...@mikemccandless.com> wrote:
> >>
> >> > Could you boil this down to a small standalone program showing
> >> > the problem?
> >> >
> >> > Optimizing in between backups should be completely fine.
> >> >
> >> > Mike
> >> >
> >> > On Fri, Aug 14, 2009 at 2:47 PM, Lucas Nazário dos
> >> > Santos <nazario.lu...@gmail.com> wrote:
> >> > > Hi,
> >> > >
> >> > > I'm using the SnapshotDeletionPolicy class to back up my index.
> >> > > I basically call the snapshot() method from the
> >> > > SnapshotDeletionPolicy class at some point, get a list of files
> >> > > that changed, copy them to the backup folder, and finish by
> >> > > calling the release() method.
> >> > >
> >> > > The problem arises when, in between backups, I optimize the
> >> > > index by opening it with the IndexWriter class and calling the
> >> > > optimize() method. When I don't optimize in between backups,
> >> > > here is what happens:
> >> > >
> >> > > The first backup copies the segment composed of the files
> >> > > _0.cfs and segments_2. The second backup copies the files
> >> > > _1.cfs and segments_3, and the third backup copies the files
> >> > > _2.cfs and segments_4. I can open the backup folder with Luke
> >> > > without problems.
> >> > >
> >> > > When I do optimize in between backups, the copies are as
> >> > > follows:
> >> > >
> >> > > The first backup copies the segment composed of the files _0.cfs and
The second backup copies the files _1.cfs and > segments_3, > >> and > >> > > the third backup copies the files _3.cfs e segments_5. In this case, > >> when > >> > I > >> > > try to open the backup folder, Luke gives a message saying that it > >> can't > >> > > find the file _2.cfs. > >> > > > >> > > My question is: how can I backup my index using the > >> > SnapshotDeletionPolicy > >> > > and having to optimize the index in between backups? Am I using the > >> right > >> > > backup strategy? > >> > > > >> > > Thanks, > >> > > Lucas > >> > > > >> > > >> > --------------------------------------------------------------------- > >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> > For additional commands, e-mail: java-user-h...@lucene.apache.org > >> > > >> > > >> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >