Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-11 Thread Uwe Schindler
Hi, that is as expected. While an IndexReader or IndexSearcher is open, the snapshot of the index it was opened on is preserved until you reopen it, because every reader only sees the index in the state it had when the reader was opened. The disk space therefore stays occupied, and on Windows you can even still see the files. For optimize (what you shou
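The point about open readers holding on to disk space can be illustrated with a small toy model. This is a sketch only, not Lucene code: the class `SnapshotModel`, its methods, and the segment names and sizes are all hypothetical, and it assumes nothing more than the rule that a file may be deleted only when neither the latest commit nor any open reader's snapshot refers to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (NOT Lucene's API): files stay on disk while any open snapshot references them. */
public class SnapshotModel {
    // Every file currently on disk, with its size (arbitrary units, e.g. GB).
    private final Map<String, Long> fileSizes = new HashMap<>();
    // Files that make up the latest commit.
    private Set<String> current = new HashSet<>();
    // File sets pinned by readers that are still open.
    private final List<Set<String>> openSnapshots = new ArrayList<>();

    /** Replaces the latest commit with the given files, then deletes what nobody references. */
    public void commit(Map<String, Long> newFiles) {
        fileSizes.putAll(newFiles);
        current = new HashSet<>(newFiles.keySet());
        deleteUnreferenced();
    }

    /** Opens a "reader": pins the files of the commit that is current right now. */
    public Set<String> openReader() {
        Set<String> snapshot = new HashSet<>(current);
        openSnapshots.add(snapshot);
        return snapshot;
    }

    /** Closes a "reader": its snapshot no longer keeps files alive. */
    public void closeReader(Set<String> snapshot) {
        openSnapshots.removeIf(s -> s == snapshot); // match by identity
        deleteUnreferenced();
    }

    /** Total size of all files still on disk. */
    public long diskUsage() {
        long total = 0;
        for (long size : fileSizes.values()) {
            total += size;
        }
        return total;
    }

    private void deleteUnreferenced() {
        Set<String> live = new HashSet<>(current);
        for (Set<String> snapshot : openSnapshots) {
            live.addAll(snapshot);
        }
        fileSizes.keySet().retainAll(live);
    }

    public static void main(String[] args) {
        SnapshotModel index = new SnapshotModel();

        Map<String, Long> segments = new HashMap<>();
        segments.put("_0", 4L);
        segments.put("_1", 4L);
        segments.put("_2", 4L);
        index.commit(segments);

        Set<String> reader = index.openReader(); // reader pins _0, _1, _2

        Map<String, Long> optimized = new HashMap<>();
        optimized.put("_3", 12L);                // one merged segment
        index.commit(optimized);

        System.out.println(index.diskUsage());   // 24: old and new segments coexist
        index.closeReader(reader);
        System.out.println(index.diskUsage());   // 12: old segments can now go
    }
}
```

In this model the size only returns to normal once the reader that was opened before the merge is closed, which matches the behaviour described in the message above.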

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-11 Thread Phil Herold
New information: it appears that the index size increases (not always doubling, but going up significantly) when I search the index while building it. Calling indexWriter.optimize(1, true); when I'm done adding documents sometimes brings the index back down to size, but not always. Has anyon

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-10 Thread Michael McCandless
IndexWriter.setInfoStream -- when you set that, it produces lots of verbose output detailing what IW is doing to the index...
Mike
On Wed, Feb 9, 2011 at 8:06 PM, Phil Herold wrote:
> I didn't have any errors or exceptions. Sorry to be dense, but what exactly
> is the "infoStream output" you're

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Phil Herold
I didn't have any errors or exceptions. Sorry to be dense, but what exactly is the "infoStream output" you're asking about?
>This is not expected.
>
>Did the last IW exit "gracefully"? If so, it should delete the old
>segments after swapping in the optimized one.
>Can you post infoStre

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Michael McCandless
This is not expected.
Did the last IW exit "gracefully"? If so, it should delete the old segments after swapping in the optimized one.
Can you post infoStream output after running optimize?
Mike
On Wed, Feb 9, 2011 at 1:58 PM, Phil Herold wrote:
> I know that the size of a Lucene index can do

index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Phil Herold
I know that the size of a Lucene index can double while optimization is underway, but it's supposed to eventually settle back down to the original size, correct? We have a Lucene index consisting of 100K documents that is normally about 12GB in size. It is split across 10 sub-indexes which we sear