See below:

On Sun, Jan 16, 2011 at 10:15 AM, sol myr <solmy...@yahoo.com> wrote:
> Hi,
>
> I'm trying to understand the behavior of file merging / optimization.
> I see that whenever my IndexWriter calls 'commit()', it creates a new file
> (or fileS).
> I also see these files merged when calling 'optimize()', as much as
> allowed by the parameter 'NoCFSRatio'.
>
> But I'm still trying to figure out:
>
> 1) Will my writer still perform some file merging, even if I don't
> explicitly call 'optimize()'?
>

Yes. The merge factor controls this so you don't end up with a huge number
of files. There are some nifty diagrams floating around on the net, but I
don't have one right at hand...

> 2) Is there a way to configure the number of files, or their size?
>

IndexWriter.setMergeFactor controls the number of segments. There's no way
I know of to control by size, however.

> 3) I always keep an open IndexSearcher (and IndexReader). I know they
> should be re-opened when a change occurs, but it's not crucial to see
> changes immediately, so I just poll periodically, and it might be a few
> minutes before my reader is re-opened and allowed to see changes.
> But will this approach disturb the writer's ability to optimize / merge
> files? If a reader is open, will it prevent file merging?
>

No, this is a fine approach. Lucene index segments are never changed. A
merge will *copy* the segments being merged into a new segment, and when
you open a new reader it will look at the new segment while the old reader
merrily looks at the old segments. This is why disk usage may double
during a merge.

Best
Erick

> Thanks
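
P.S. Since I couldn't dig up one of those diagrams, here is a rough sketch of the idea behind the merge factor. It is a toy model, not Lucene's actual merge policy and not Lucene code: it assumes every flush writes one small segment, and that as soon as mergeFactor segments pile up at the same "level" they are merged into a single segment one level up. The class and method names (MergeFactorSketch, segmentCountAfter) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this does not use or reproduce Lucene's real merge policy.
class MergeFactorSketch {

    // Returns how many segments remain after `flushes` new segments are written,
    // merging whenever `mergeFactor` segments accumulate at the same level.
    static int segmentCountAfter(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        // perLevel.get(i) = number of segments currently sitting at level i
        List<Integer> perLevel = new ArrayList<>();
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() == level) {
                    perLevel.add(0);
                }
                int n = perLevel.get(level) + 1;
                if (n < mergeFactor) {
                    perLevel.set(level, n);
                    break;
                }
                // mergeFactor segments at this level: merge them into one
                // new segment at the next level up and continue there.
                perLevel.set(level, 0);
                level++;
            }
        }
        int total = 0;
        for (int n : perLevel) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int flushes : new int[] {9, 10, 25, 99, 100}) {
            System.out.println(flushes + " flushes -> "
                    + segmentCountAfter(flushes, 10) + " segments");
        }
    }
}
```

In this model the segment count stays small no matter how many flushes happen, which is the point of the merge factor: a lower value means fewer segments on disk but more merging work, a higher value the reverse.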