Hi, a commit is actually just the list of segments and their (immutable) files. Because the files are immutable, every commit point can safely refer to files that are also used by an earlier commit point. In your code, you should use a Set<String> instead of a List<String>. Depending on how many changes you have between the snapshots/commits, I would expect most of the files to overlap. If you only added one document to an index and then created a new snapshot, you would see basically the same files in both commits, while the newer one has one segment more (one with only one document). If you delete documents, the same happens: you get additional files for one of the earlier segments (delete generations), but the basic set of files stays identical. If (automatic) segment merging was done between two snapshots, of course more files can change, because smaller segments may have been combined with other ones in newer snapshots.
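To illustrate the Set suggestion, here is a minimal sketch that uses only the JDK, so it runs without Lucene on the classpath. The class name CommitFileUnion, the method uniqueFiles and the file names are made up for this example; in your code each inner collection would be whatever commit.getFileNames() returns for one snapshot.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: each inner collection stands in for the file
// names of one snapshotted commit (in Lucene: IndexCommit.getFileNames()).
class CommitFileUnion {

    // Collects the file names of all commits into a Set, so a file that is
    // shared by several commit points is listed (and later copied) only once.
    static Set<String> uniqueFiles(List<? extends Collection<String>> commits) {
        Set<String> files = new LinkedHashSet<String>();
        for (Collection<String> commitFiles : commits) {
            files.addAll(commitFiles);
        }
        return files;
    }

    public static void main(String[] args) {
        // Hypothetical file names: the second commit reuses segment _0
        // unchanged and adds one new small segment _1.
        List<String> firstCommit =
            Arrays.asList("segments_1", "_0.si", "_0.cfs", "_0.cfe");
        List<String> secondCommit =
            Arrays.asList("segments_2", "_0.si", "_0.cfs", "_0.cfe",
                          "_1.si", "_1.cfs", "_1.cfe");

        Set<String> files = uniqueFiles(Arrays.asList(firstCommit, secondCommit));
        System.out.println(files.size() + " unique files: " + files);
    }
}
```

With a List, the three files of segment _0 would show up twice in this example; with the Set, each file appears once no matter how many commit points refer to it.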
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Vitaly Funstein [mailto:vfunst...@gmail.com]
> Sent: Friday, March 21, 2014 1:35 AM
> To: java-user@lucene.apache.org
> Subject: Segments reusable across commits?
>
> I have a usage pattern where I need to package up and store away all files
> from an index referenced by multiple commit points. To that end, I basically
> call IndexWriter.commit(), followed by SnapshotDeletionPolicy.snapshot(),
> followed by something like this:
>
> List<String> files = new ArrayList<String>(dir.listAll().length);
> for (IndexCommit commit: snapshotter.getSnapshots()) {
>     files.addAll(commit.getFileNames());
> }
>
> As it turns out, this creates duplicates, specifically some .si files appear
> to be present in multiple commit points. Is this expected, and if so - does
> this mean that some commits are allowed to reuse segments created by prior
> commits? I have always thought that each commit creates a new set of
> segments... I'm using Lucene 4.6.