Hi Vitaly,

Your hunch is correct: there are unmerged segments left over. To get indexing throughput, however, we use multiple threads on the writer, flushing to disk periodically, and the writer can stay open for some time (until the last thread terminates). After an optimize, though, the index is closed.

Thanks for the advice. I need to revisit the merging section of the application.

Clive

________________________________
From: Vitaly Funstein <vfunst...@gmail.com>
To: java-user@lucene.apache.org
Sent: Friday, October 26, 2012 8:13 PM
Subject: Re: Lucene 3.6.0 Index Size

One thing to keep in mind is that the default merge policy has changed in 3.6 from 2.3.2 (I'm almost certain of that). So it's just a hunch, but you may have some unmerged segments left over at the end. Try calling IndexWriter.close(true) after you're done indexing.

On Fri, Oct 26, 2012 at 10:50 AM, kiwi clive <kiwi_cl...@yahoo.com> wrote:

> Hello.
>
> We have an index that, when created using Lucene 2.3.2, has a size of about
> 4G.
>
> Creating the same index (with the same parameters) with Lucene 3.6.0
> results in an 11G index.
>
> Could someone shed some light on why the index is so much larger, given
> the same data and the same parameters?
>
> I realize this is a large version jump, but a doubling in index size does
> not seem a step in the right direction to me ;-)
>
> I am using cfs format.
>
> Thanks,
> Clive
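
For reference, the pattern discussed in this thread (several threads feeding one shared writer, then a single close that waits for merges once the last thread has finished) can be sketched roughly as below. This is only an illustration under stated assumptions: `SharedWriter`, `CountingWriter` and `indexAll` are hypothetical names invented for the sketch, not Lucene classes. `SharedWriter` stands in for the two Lucene 3.6 calls mentioned above, `IndexWriter.addDocument(Document)` and `IndexWriter.close(boolean waitForMerges)`, so that the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class IndexingSketch {

    /** Hypothetical stand-in for the IndexWriter calls discussed in the thread. */
    interface SharedWriter {
        void addDocument(String doc) throws Exception;
        void close(boolean waitForMerges) throws Exception;
    }

    /** Indexes each batch on its own thread, then closes the shared writer exactly once. */
    static void indexAll(SharedWriter writer, List<List<String>> batches) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, batches.size()));
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (List<String> batch : batches) {
                tasks.add(() -> {
                    for (String doc : batch) {
                        writer.addDocument(doc);
                    }
                    return null;
                });
            }
            // invokeAll returns only after every indexing task has finished.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows any failure from an indexing thread
            }
        } finally {
            pool.shutdown();
            // Close once, after the last thread is done, asking the writer to wait for merges.
            writer.close(true);
        }
    }

    /** Hypothetical in-memory writer that only records calls, used for the demo. */
    static final class CountingWriter implements SharedWriter {
        final AtomicInteger added = new AtomicInteger();
        final AtomicInteger closed = new AtomicInteger();
        volatile boolean waitedForMerges;
        volatile int addedAtClose = -1;

        public void addDocument(String doc) {
            added.incrementAndGet();
        }

        public void close(boolean waitForMerges) {
            closed.incrementAndGet();
            waitedForMerges = waitForMerges;
            addedAtClose = added.get();
        }
    }

    public static void main(String[] args) throws Exception {
        CountingWriter writer = new CountingWriter();
        indexAll(writer, Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")));
        System.out.println(writer.added.get() + " docs added, closed "
                + writer.closed.get() + " time(s), waitForMerges=" + writer.waitedForMerges);
    }
}
```

In a real application the `SharedWriter` calls would presumably be replaced by the `IndexWriter` itself. The point of the sketch is only the ordering: no thread is still adding documents when the single `close(true)` runs.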