Hi,

Thanks a lot for the clarification. I will do that (force merge) at the end, just to check that everything on my side (:-)) is working correctly.
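For that check, here is a minimal sketch of how the totals could be compared, assuming each test run writes its index to a plain directory on disk and that forceMerge(1) has been called and the writer closed before measuring. It only sums file sizes, since file hashes are not expected to match. The names IndexSize and totalBytes are hypothetical helpers of mine, not part of Lucene.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene: sums the sizes of all regular
// files under an index directory so two runs can be compared by total size.
final class IndexSize {

    // Walks the directory tree and adds up the size of every regular file.
    static long totalBytes(Path indexDir) throws IOException {
        try (Stream<Path> files = Files.walk(indexDir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(IndexSize::sizeOf)
                        .sum();
        }
    }

    // Wraps the checked IOException so the method can be used in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java IndexSize <indexDirA> <indexDirB>");
            return;
        }
        long a = totalBytes(Paths.get(args[0]));
        long b = totalBytes(Paths.get(args[1]));
        System.out.println("run A: " + a + " bytes, run B: " + b + " bytes");
        System.out.println(a == b
                ? "sizes match"
                : "sizes differ by " + Math.abs(a - b) + " bytes");
    }
}
```

The two arguments are the index directories of two separate runs over the same dataset. Whether the totals come out exactly equal still depends on the merge having produced a single segment in both runs.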
Regards,

On Tue, Mar 25, 2014 at 5:41 AM, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> The reason for this is multithreaded merging. While indexing, Lucene
> merges segments in separate threads. As this runs multithreaded, there is
> no strict "order of things". Depending on how fast the disk is or what
> other processes are running in parallel, the merging may proceed faster or
> slower, creating a different "index structure" in which segments are
> merged in other combinations, leading to different term dictionary or
> posting list sizes.
>
> If you do a forceMerge(1) at the end (this can take a very long time), the
> whole index is merged into one segment, which should have the same size for
> the same dataset. Please don't compare file MD5/SHA1; the files will *not*
> be identical, because the order of documents may still vary.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Jose Carlos Canova [mailto:jose.carlos.can...@gmail.com]
> > Sent: Tuesday, March 25, 2014 6:36 AM
> > To: java-user@lucene.apache.org
> > Subject: Index size for Same DataSet.
> >
> > Hello,
> >
> > I have a question about index size.
> > I am testing a program that uses Lucene to index a dataset.
> >
> > In the end, the resulting index size varies a little between runs. Since
> > I haven't finished the tests yet, I'm not sure whether it is normal for
> > the index size to vary among different tests.
> >
> > Regards,
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org