Ian,

Thanks for your help and patience with me.  I started porting this to a
simple self-contained example, and I finally found the error in my code.
IndexSearcher.close() doesn't close the underlying IndexReader when you use
the constructor new IndexSearcher( IndexReader ).  I went back and re-read
the documentation on IndexSearcher.close() and found that important
sentence.  Previously, in 2.4, I hadn't used that constructor, and closing
the searcher would also close any underlying IndexReaders.  When I upgraded
to 3.1, the constructor I had been using was removed, so I switched to
passing in the IndexReader.
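
For anyone else hitting this, here is a minimal sketch of the pattern that
fixed it for me (indexDir and query are placeholders, not my real code):

    IndexReader reader = IndexReader.open(FSDirectory.open(indexDir), true);
    IndexSearcher searcher = new IndexSearcher(reader);
    try {
        TopDocs hits = searcher.search(query, 10);
        // ... use hits ...
    } finally {
        searcher.close();  // does NOT close a reader passed to the constructor
        reader.close();    // so close it explicitly as well
    }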

So, as usual, it was pilot error.  It was never Lucene's problem, just me
not being careful when switching to new methods.  I'm sorry, Lucene; you
are still my favorite library. :-)

Thanks again.
Charlie

On Mon, Jan 9, 2012 at 12:10 PM, Ian Lea <ian....@gmail.com> wrote:

> Charlie
>
>
> From the FAQ
> http://wiki.apache.org/lucene-java/LuceneFAQ#Does_Lucene_allow_searching_and_indexing_simultaneously.3F
>
> "...  an IndexReader only searches the index as of the "point in time"
> that it was opened. Any updates to the index, either added or deleted
> documents, will not be visible until the IndexReader is re-opened..."
>
> And the changes to the index won't be visible until you call commit()
> or close() on the writer.  So you need to call commit() and then (re)
> open readers.
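>
> Something like this, roughly - a sketch only, where reader and searcher
> are whatever you share between requests.  In 3.1, reopen() returns a new
> reader only if the index has actually changed:
>
>     writer.commit();                           // make buffered changes visible
>     IndexReader newReader = reader.reopen();   // cheap no-op if nothing changed
>     if( newReader != reader ) {
>         reader.close();                        // close the old reader explicitly
>         reader = newReader;
>         searcher = new IndexSearcher(reader);  // swap in a fresh searcher
>     }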
>
> I don't see how you can get leaked files when you do call close and
> not when you don't.  Can you narrow it down to a simple standalone
> program?
>
>
> --
> Ian.
>
>
> On Mon, Jan 9, 2012 at 3:10 PM, Charlie Hubbard
> <charlie.hubb...@gmail.com> wrote:
> > Ian,
> >
> > From reading the docs it seems clear that all I need to do is call
> > IndexWriter.commit() in order for the changes from my single IndexWriter
> > to be visible to the IndexReader, and hence to my single IndexSearcher.
> > When you say "you need to close old readers, you need to reopen readers
> > to pick up changes," it sounds like I can't call commit() on the
> > IndexWriter and expect my IndexReaders to see the changes made.  What's
> > wrong with using commit() to see changes?  Are there limitations to
> > this?  Am I not doing it right?
> >
> >
> > http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/index/IndexWriter.html
> >
> > What I am seeing is that if I close my IndexSearcher, I see leaked files
> > that do not exist on the hard disk, probably because they were merged
> > away.  If I don't call IndexSearcher.close(), I don't see any leaked
> > files.  I used this code that would close() the IndexWriter and
> > IndexReader in order to make changes visible, and it worked great in
> > 2.4.  But I see leaking in 3.1.
> >
> > Charlie
> >
> > On Mon, Jan 9, 2012 at 5:15 AM, Ian Lea <ian....@gmail.com> wrote:
> >
> >> It's hard, impossible for me, to figure out from this what your
> >> problem might be.  Multiple indexes, MultiReader, multiple writers
> >> (?), multiple threads?
> >>
> >> However I can make some statements: Lucene doesn't leak files, you
> >> need to close old readers, you need to reopen readers to pick up
> >> changes.
> >>
> >> Have you looked at the NRT stuff?  There are also classes called
> >> oal.search.NRTManager and oal.search.SearcherManager, now part of the
> >> core, previously available via an LIA download.  I'm not sure they
> >> work with MultiReaders, but they could certainly be mined for ideas.
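> >>
> >> From 3.5 the usage pattern is roughly as below.  A sketch only - I
> >> haven't tried it against your setup, and the constructor arguments
> >> (warmer, executor) vary by version, so check the javadocs:
> >>
> >>     // null warmer and executor should be acceptable defaults in 3.5
> >>     SearcherManager mgr = new SearcherManager(writer, true, null, null);
> >>
> >>     // per search:
> >>     IndexSearcher s = mgr.acquire();
> >>     try {
> >>         TopDocs hits = s.search(query, 10);
> >>     } finally {
> >>         mgr.release(s);   // never close() an acquired searcher directly
> >>     }
> >>
> >>     // after writer.commit(), refresh so new searches see the changes:
> >>     mgr.maybeReopen();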
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Sat, Jan 7, 2012 at 11:56 PM, Charlie Hubbard
> >> <charlie.hubb...@gmail.com> wrote:
> >> > Ok, I think I've fixed my original problem by converting everything
> >> > to use commit() and never calling close() except when the server
> >> > shuts down.  This means I'm not closing my IndexWriter or
> >> > IndexSearcher after opening them.  I periodically call commit() on
> >> > the IndexWriter after indexing my documents.  However, my new issue
> >> > is that the changes made to the index aren't reflected after the
> >> > first time commit() is called.  The 2nd, 3rd, etc. calls to commit()
> >> > never show up when doing searches.  I reread the API docs, and it
> >> > seems like this should work.
> >> >
> >> > My client code is the following:
> >> >
> >> >    SearchIndex indexer = ...;
> >> >    if( incomingFiles != null ) {
> >> >        for( File incoming : incomingFiles ) {
> >> >            indexer.process( incoming );
> >> >        }
> >> >        indexer.commit();
> >> >    }
> >> >
> >> > Here is the excerpt from my SearchIndex class:
> >> >
> >> > public class SearchIndex {
> >> >    ...
> >> >    protected IndexSearcher getSearcher() throws IOException {
> >> >        synchronized( searchLock ) {
> >> >            if( searcher == null ) {
> >> >                List<IndexReader> readers = new ArrayList<IndexReader>();
> >> >                for( MailDrop drop : store.getMailDrops() ) {
> >> >                    File index = drop.getIndex(indexName);
> >> >                    if( index.exists() ) {
> >> >                        readers.add(IndexReader.open(FSDirectory.open(index), true));
> >> >                    }
> >> >                }
> >> >                if( logger.isDebugEnabled() ) logger.debug("Opening searcher: " + indexName );
> >> >                searcher = new IndexSearcher(new MultiReader(
> >> >                        readers.toArray( new IndexReader[readers.size()] ), true ));
> >> >            }
> >> >            return searcher;
> >> >        }
> >> >    }
> >> >
> >> >    public IndexWriter getWriter() throws IOException, InterruptedException {
> >> >        synchronized( writerLock ) {
> >> >            if( writer == null ) {
> >> >                if( reader != null ) { // is someone currently deleting?  Then wait();
> >> >                    writerLock.wait();
> >> >                }
> >> >                MailDrop mailDrop = store.getActiveMailDrop();
> >> >                if( mailDrop == null ) return null;
> >> >                writer = createWriter( mailDrop.getIndex(indexName) );
> >> >            }
> >> >            return writer;
> >> >        }
> >> >    }
> >> >
> >> >    private IndexWriter createWriter(File index) throws IOException {
> >> >        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_31);
> >> >
> >> >        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_31, analyzer);
> >> >        if( mergeFactor != null ) {
> >> >            ((LogMergePolicy)config.getMergePolicy()).setMergeFactor( mergeFactor );
> >> >        }
> >> >
> >> >        if( logger.isDebugEnabled() ) {
> >> >            logger.debug("Opening writer for " + index.getAbsolutePath() );
> >> >        }
> >> >        IndexWriter indexWriter = new IndexWriter( FSDirectory.open(index), config );
> >> >        return indexWriter;
> >> >    }
> >> >
> >> >    public IndexReader getReader() throws IOException, InterruptedException {
> >> >        synchronized( writerLock ) {
> >> >            if( reader == null ) {
> >> >                if( writer != null ) { // is someone currently writing?  Then wait();
> >> >                    writerLock.wait();
> >> >                }
> >> >                List<IndexReader> readers = new ArrayList<IndexReader>( store.getMailDrops().size() );
> >> >                for( MailDrop drop : store.getMailDrops() ) {
> >> >                    File index = drop.getIndex(indexName);
> >> >                    if( !drop.isReadOnly() && index.exists() ) {
> >> >                        readers.add( IndexReader.open( FSDirectory.open(index), true ) );
> >> >                    }
> >> >                }
> >> >                reader = new MultiReader( readers.toArray( new IndexReader[readers.size()] ) );
> >> >            }
> >> >            return reader;
> >> >        }
> >> >    }
> >> >
> >> >    public boolean isIndexWritable() {
> >> >        for( MailDrop drop : store.getMailDrops() ) {
> >> >            File index = drop.getIndex(indexName);
> >> >            if( !drop.isReadOnly() && index.exists() ) {
> >> >                return true;
> >> >            }
> >> >        }
> >> >        return false;
> >> >    }
> >> >
> >> >    public void close() throws IOException {
> >> >        closeWriter();
> >> >        closeSearcher();
> >> >    }
> >> >
> >> >    public void commit() throws IOException {
> >> >        try {
> >> >            if( writer != null ) {
> >> >                if( logger.isDebugEnabled() ) logger.debug("Committing changes to the index " + indexName );
> >> >                writer.commit();
> >> >            }
> >> >        } catch( OutOfMemoryError e ) {
> >> >            logger.error("Out of memory while committing index.  Index will be closed: " + indexName, e);
> >> >            closeWriter();
> >> >        }
> >> >    }
> >> >
> >> >    private void closeWriter() throws IOException {
> >> >        synchronized( writerLock ) {
> >> >            try {
> >> >                if( writer != null ) {
> >> >                    logger.debug("Closing the writer for " + indexName);
> >> >                    writer.close();
> >> >                }
> >> >                writerLock.notifyAll();
> >> >            } finally {
> >> >                writer = null;
> >> >            }
> >> >        }
> >> >    }
> >> >
> >> >    private void closeSearcher() throws IOException {
> >> >        synchronized( searchLock ) {
> >> >            if( searcher != null && activeSearches.get() == 0 ) {
> >> >                try {
> >> >                    logger.debug( "Closing the searcher for " + indexName );
> >> >                    searcher.close();
> >> >                } finally {
> >> >                    searcher = null;
> >> >                }
> >> >            }
> >> >            searchLock.notifyAll();
> >> >        }
> >> >    }
> >> >
> >> >    public void closeReader() throws IOException, InterruptedException {
> >> >        synchronized( writerLock ) {
> >> >            if( reader != null ) {
> >> >                try {
> >> >                    reader.close();
> >> >                } finally {
> >> >                    reader = null;
> >> >                }
> >> >                optimize();
> >> >            }
> >> >            writerLock.notifyAll();
> >> >        }
> >> >    }
> >> > }
> >> >
> >> > Just so you are aware of what my problem was (I don't think I was
> >> > doing anything incorrect here), I had some code in getSearcher() doing
> >> > the following:
> >> >
> >> >    protected IndexSearcher getSearcher() throws IOException {
> >> >        synchronized( searchLock ) {
> >> >            if( searcher == null || ( activeSearches.get() == 0 &&
> >> >                    !searcher.getIndexReader().isCurrent() ) ) {   <<<<<< problem
> >> >                if( searcher != null ) {
> >> >                    logger.debug("Closing the searcher for " + indexName);
> >> >                    searcher.close();
> >> >                }
> >> >                List<IndexReader> readers = new ArrayList<IndexReader>();
> >> >                for( MailDrop drop : store.getMailDrops() ) {
> >> >                    File index = drop.getIndex(indexName);
> >> >                    if( index.exists() ) {
> >> >                        readers.add(IndexReader.open(FSDirectory.open(index), true));
> >> >                    }
> >> >                }
> >> >                searcher = new IndexSearcher(new MultiReader(
> >> >                        readers.toArray( new IndexReader[readers.size()] ), true ));
> >> >                searcherTimestamp = System.currentTimeMillis();
> >> >            }
> >> >            return searcher;
> >> >        }
> >> >    }
> >> >
> >> > That isCurrent() check would sometimes say the index was out of date,
> >> > so it closed the searcher and reopened it.  I changed it to only use
> >> > commit() and never close, and it doesn't leak files.  But I'm not
> >> > seeing the changes.  What I don't understand is why calling close()
> >> > would leak files.  I've double-checked with my logs that close() was
> >> > definitely being called.  Again, this code was the same in 2.4 and it
> >> > didn't leak files, but under 3.1 it leaked files.
> >> >
> >> > Thanks for your help,
> >> >
> >> > Charlie
> >> >
> >> > On Fri, Jan 6, 2012 at 3:06 PM, Ian Lea <ian....@gmail.com> wrote:
> >> >
> >> >> Something that did change at some point, I can't remember when, was
> >> >> the way that discarded but not explicitly closed searchers/readers
> >> >> are handled.  I think that they used to get garbage collected,
> >> >> causing open files to be closed, but now they need to be explicitly
> >> >> closed.  Sounds to me like you are opening new searchers/readers
> >> >> without closing old ones.
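> >> >>
> >> >> If it helps, the swap should look something like this.  Just a
> >> >> sketch - openNewSearcher() stands for whatever code of yours builds
> >> >> the new MultiReader and searcher:
> >> >>
> >> >>     IndexSearcher oldSearcher = searcher;
> >> >>     IndexReader oldReader = oldSearcher.getIndexReader();
> >> >>     searcher = openNewSearcher();  // your code: new MultiReader + IndexSearcher
> >> >>     oldSearcher.close();
> >> >>     oldReader.close();  // close() on a searcher won't close a reader you passed in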
> >> >>
> >> >>
> >> >> --
> >> >> Ian.
> >> >>
> >> >>
> >> >> On Fri, Jan 6, 2012 at 6:50 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >> > Can you show the code? In particular are you re-opening
> >> >> > the index writer?
> >> >> >
> >> >> > Bottom line: This isn't a problem anyone expects
> >> >> > in 3.1 absent some programming error on your
> >> >> > part, so it's hard to know what to say without
> >> >> > more information.
> >> >> >
> >> >> > 3.1 has other problems if you use spellcheck.collate; you might
> >> >> > want to upgrade to at least 3.3 if you use that feature.  But I
> >> >> > truly believe that this is irrelevant to your problem.
> >> >> >
> >> >> > Best
> >> >> > Erick
> >> >> >
> >> >> >
> >> >> > On Fri, Jan 6, 2012 at 1:25 PM, Charlie Hubbard
> >> >> > <charlie.hubb...@gmail.com> wrote:
> >> >> >> Thanks for the reply.  I'm still having trouble.  I've made some
> >> >> >> changes to use commit() over close(), but I'm not seeing much
> >> >> >> change in what seems like an ever-increasing number of open file
> >> >> >> handles.  I'm developing on Mac OS X 10.6 and testing on Linux
> >> >> >> CentOS 4.5.  My biggest problem is that I can't tell why lsof says
> >> >> >> this process has so many open files.  I'm seeing the same files
> >> >> >> opened more than once, and I'm seeing files show up in the lsof
> >> >> >> output that don't exist on the file system.  For example, here is
> >> >> >> the Lucene directory:
> >> >> >>
> >> >> >> -rw-r--r-- 1 root root  328396 Jan 5 20:21 _ly.fdt
> >> >> >> -rw-r--r-- 1 root root    6284 Jan 5 20:21 _ly.fdx
> >> >> >> -rw-r--r-- 1 root root    2253 Jan 5 20:21 _ly.fnm
> >> >> >> -rw-r--r-- 1 root root  234489 Jan 5 20:21 _ly.frq
> >> >> >> -rw-r--r-- 1 root root   15704 Jan 5 20:21 _ly.nrm
> >> >> >> -rw-r--r-- 1 root root 1113954 Jan 5 20:21 _ly.prx
> >> >> >> -rw-r--r-- 1 root root    5421 Jan 5 20:21 _ly.tii
> >> >> >> -rw-r--r-- 1 root root  445988 Jan 5 20:21 _ly.tis
> >> >> >> -rw-r--r-- 1 root root  118262 Jan 6 09:56 _nx.cfs
> >> >> >> -rw-r--r-- 1 root root   10009 Jan 6 10:00 _ny.cfs
> >> >> >> -rw-r--r-- 1 root root      20 Jan 6 10:00 segments.gen
> >> >> >> -rw-r--r-- 1 root root     716 Jan 6 10:00 segments_kw
> >> >> >>
> >> >> >>
> >> >> >> And here is an excerpt from: lsof -p 19422 | awk -- '{print $9}' | sort
> >> >> >>
> >> >> >> ...
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lp.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lq.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lr.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_ls.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_ls.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_ls.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_ls.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_ls.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lt.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lt.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lt.cfs
> >> >> >> /usr/local/emailarchive/mailarchive/lucene/indexes/mail/_lt.cfs
> >> >> >> ...
> >> >> >>
> >> >> >> As you can see, none of those files actually exist.  Not only
> >> >> >> that, but they are opened 8 or 9 times.  There are tons of these
> >> >> >> non-existent, repeatedly opened files in the output.  So why are
> >> >> >> the handles being counted as open?
> >> >> >>
> >> >> >> I have a single IndexWriter and a single IndexSearcher open on a
> >> >> >> single CFS directory.  The writer is only used by a single thread,
> >> >> >> but the IndexSearcher can be shared among several threads.  I
> >> >> >> still think something has changed in 3.1 that's causing this.  I
> >> >> >> hope you can help me understand how it's not.
> >> >> >>
> >> >> >> Charlie
> >> >> >>
> >> >> >> On Mon, Jan 2, 2012 at 3:03 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
> >> >> >>
> >> >> >>> hey charlie,
> >> >> >>>
> >> >> >>> there are a couple of wrong assumptions in your last email, mostly
> >> >> >>> related to merging. mergeFactor = 10 doesn't mean that you end up
> >> >> >>> with one file, nor is it related to files. Yet, my first guess is
> >> >> >>> that you are using the CompoundFileSystem (CFS), so each segment
> >> >> >>> corresponds to a single file. The merge factor relates to segments
> >> >> >>> and is responsible for triggering segment merges by their size
> >> >> >>> (either in bytes or in documents). For more details see this blog:
> >> >> >>>
> >> >> >>> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
> >> >> >>>
> >> >> >>> If you are using CFS, one segment is one file. In 3.1, CFS is only
> >> >> >>> used if the target segment is smaller than the noCFSRatio; that
> >> >> >>> prevents segments bigger than a fraction of the existing index (by
> >> >> >>> default 0.1 -> 10%) from being packed into CFS.
> >> >> >>>
> >> >> >>> this means your index might create non-CFS segments with multiple
> >> >> >>> files (10 per segment in the worst case... maybe I missed one but
> >> >> >>> anyway...), which means the number of open files increases.
> >> >> >>>
> >> >> >>> This is only a guess, since I don't know what you are doing with
> >> >> >>> your index readers etc. Which platform are you on, and what is the
> >> >> >>> file descriptor limit? In general it's OK to raise the FD limit on
> >> >> >>> your OS and just let Lucene do its job. If you are restricted in
> >> >> >>> any way, you can set LogMergePolicy#setNoCFSRatio(double) to 1.0
> >> >> >>> and see if you are still seeing the problem.
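> >> >> >>>
> >> >> >>> something along these lines - just a sketch against the 3.1 config
> >> >> >>> API, where analyzer and indexDir are whatever you already use:
> >> >> >>>
> >> >> >>>     IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_31, analyzer);
> >> >> >>>     LogMergePolicy mp = (LogMergePolicy) config.getMergePolicy();
> >> >> >>>     mp.setNoCFSRatio(1.0);  // always pack segments into CFS -> fewer open files
> >> >> >>>     IndexWriter writer = new IndexWriter(FSDirectory.open(indexDir), config);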
> >> >> >>>
> >> >> >>> About commit vs. close - in general it's not a good idea to close
> >> >> >>> your IW at all. I'd keep it open as long as you can and commit when
> >> >> >>> needed. Even optimize is somewhat overrated and should be used with
> >> >> >>> care or not at all... (here is another writeup regarding optimize:
> >> >> >>>
> >> >> >>> http://www.searchworkings.org/blog/-/blogs/simon-says%3A-optimize-is-bad-for-you
> >> >> >>> )
> >> >> >>>
> >> >> >>>
> >> >> >>> hope that helps,
> >> >> >>>
> >> >> >>> simon
> >> >> >>>
> >> >> >>>
> >> >> >>> On Mon, Jan 2, 2012 at 5:38 PM, Charlie Hubbard
> >> >> >>> <charlie.hubb...@gmail.com> wrote:
> >> >> >>> > I'm beginning to think there is an issue with 3.1 that's causing
> >> >> >>> > this.  After looking over my code again, I remembered that the
> >> >> >>> > mechanism that does the indexing hasn't changed, and the index
> >> >> >>> > IS being closed between cycles, even when using push vs. pull.
> >> >> >>> > This code used to work on 2.x Lucene, but I had to upgrade it.
> >> >> >>> > It had been very stable under 2.x, but after upgrading to 3.1
> >> >> >>> > I've started seeing this problem.  I double-checked the code
> >> >> >>> > doing the indexing, and it hasn't changed since I upgraded to
> >> >> >>> > 3.1.  So the constant in this equation is mostly my code.
> >> >> >>> > What's different is 3.1.  Furthermore, when new documents are
> >> >> >>> > pulled in through the old mechanism, the open file count
> >> >> >>> > continues to rise.  Over a 24-hour period it's grown by 296
> >> >> >>> > files, with only 10 or 12 documents indexed.
> >> >> >>> >
> >> >> >>> > So is this a known issue?  Should I upgrade to a newer version
> >> >> >>> > to fix this?
> >> >> >>> >
> >> >> >>> > Thanks
> >> >> >>> > Charlie
> >> >> >>> >
> >> >> >>> > On Sat, Dec 31, 2011 at 1:01 AM, Charlie Hubbard
> >> >> >>> > <charlie.hubb...@gmail.com> wrote:
> >> >> >>> >
> >> >> >>> >> I have a program I recently converted from a pull scheme to a
> >> >> >>> >> push scheme.  Previously I was pulling down the documents I was
> >> >> >>> >> indexing, and when I was done I'd close the IndexWriter at the
> >> >> >>> >> end of each iteration.  Now that I've converted to a push
> >> >> >>> >> scheme, I'm sent the documents to index and I write them.
> >> >> >>> >> However, this means I'm not closing the IndexWriter, since
> >> >> >>> >> closing after every document would perform poorly.  Instead I'm
> >> >> >>> >> keeping the IndexWriter open all the time.  The problem is that
> >> >> >>> >> after a while the number of open files continues to rise.  I've
> >> >> >>> >> set the following parameters on the IndexWriter:
> >> >> >>> >>
> >> >> >>> >> merge.factor=10
> >> >> >>> >> max.buffered.docs=1000
> >> >> >>> >>
> >> >> >>> >> After going over the API docs I thought this would mean it'd
> >> >> >>> >> never create more than 10 files before merging them into a
> >> >> >>> >> single file, but it's creating hundreds of files.  Since I'm
> >> >> >>> >> not closing the IndexWriter, will it merge the files?  From
> >> >> >>> >> reading the API docs it sounded like merging happens regardless
> >> >> >>> >> of flushing, commit, or close.  Is that true?  I've measured
> >> >> >>> >> the files that are increasing, and they're the files associated
> >> >> >>> >> with the one index I'm leaving open.  I have another index that
> >> >> >>> >> I do close periodically, and it's not growing like this one.
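> >> >> >>> >>
> >> >> >>> >> For reference, this is roughly how I'm applying those settings
> >> >> >>> >> (a simplified sketch, with indexDir standing in for my real
> >> >> >>> >> index directory):
> >> >> >>> >>
> >> >> >>> >>     IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_31,
> >> >> >>> >>             new StandardAnalyzer(Version.LUCENE_31));
> >> >> >>> >>     config.setMaxBufferedDocs(1000);  // flush after 1000 buffered docs
> >> >> >>> >>     ((LogMergePolicy) config.getMergePolicy()).setMergeFactor(10);
> >> >> >>> >>     IndexWriter writer = new IndexWriter(FSDirectory.open(indexDir), config);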
> >> >> >>> >>
> >> >> >>> >> I've read some posts about using commit() instead of close()
> >> >> >>> >> in situations like this because of its faster performance.
> >> >> >>> >> However, commit() just flushes to disk rather than flushing and
> >> >> >>> >> optimizing like close().  I'm not sure whether commit() is what
> >> >> >>> >> I need.  Any suggestions?
> >> >> >>> >>
> >> >> >>> >> Thanks
> >> >> >>> >> Charlie
> >> >> >>> >>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>
>
