Re: Index size for Same DataSet.

2014-03-25 Thread Jose Carlos Canova
Hi, thanks a lot for the clarification. I will do that (force merge) at the end, just to check that everything on my side (:-)) is working correctly. Regards. On Tue, Mar 25, 2014 at 5:41 AM, Uwe Schindler wrote: > Hi, > > The reason for this is multithreaded merging. While indexing, Lucene > merges segments in

RE: Index size for Same DataSet.

2014-03-25 Thread Uwe Schindler
Hi, The reason for this is multithreaded merging. While indexing, Lucene merges segments in separate threads. As this runs multithreaded, there is no strict "order of things". Depending on how fast the disk is or what other processes are running in parallel, the merging may proceed fast or sl

Re: Index size for Same DataSet.

2014-03-25 Thread Erick Erickson
You're probably fine. Part of indexing is merging segments, and when segments are merged the data from deleted (or updated) documents is reclaimed. Any slight variance in the commit algorithm will potentially reclaim more or less space. What happens if you optimize (forceMerge) as a final step? Th
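
The point made in this thread can be illustrated with a small toy model. This is only a sketch and does not use Lucene: the class MergeSizeSketch, the Segment type, and all document counts are hypothetical. It assumes that a segment's size is proportional to its live plus deleted documents, and that a merge drops the deleted ones. Under those assumptions, two runs over the same data end up with different sizes when background merges finish at different times, and with the same size after a final merge down to one segment (in Lucene itself, that final step would be a call such as IndexWriter.forceMerge(1)).

```java
import java.util.List;

public class MergeSizeSketch {

    /** Toy segment: live documents plus deleted documents not yet reclaimed. */
    public static final class Segment {
        final int live;
        final int deleted;

        Segment(int live, int deleted) {
            this.live = live;
            this.deleted = deleted;
        }

        /** Assumption: size on disk is proportional to all stored documents. */
        int sizeOnDisk() {
            return live + deleted;
        }
    }

    /** Merging keeps only live documents, so deleted ones are reclaimed. */
    public static Segment merge(List<Segment> segments) {
        int live = 0;
        for (Segment s : segments) {
            live += s.live;
        }
        return new Segment(live, 0);
    }

    /** Total size of an index made of the given segments. */
    public static int totalSize(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) {
            total += s.sizeOnDisk();
        }
        return total;
    }

    /** Run A: a background merge of the first two segments already finished. */
    public static List<Segment> runA() {
        Segment merged = merge(List.of(new Segment(80, 20), new Segment(90, 10)));
        return List.of(merged, new Segment(100, 0));
    }

    /** Run B: same data, but the background merge has not happened yet. */
    public static List<Segment> runB() {
        return List.of(new Segment(80, 20), new Segment(90, 10), new Segment(100, 0));
    }

    public static void main(String[] args) {
        System.out.println(totalSize(runA()));           // 270
        System.out.println(totalSize(runB()));           // 300
        // After a final merge to a single segment, both runs agree.
        System.out.println(merge(runA()).sizeOnDisk());  // 270
        System.out.println(merge(runB()).sizeOnDisk());  // 270
    }
}
```

In this model the 30-document gap between the two runs is just deleted data that run B has not reclaimed yet, which is why comparing sizes only after a final merge is the more meaningful check.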