Hmm, not good. If you are really only adding documents, you should be using IndexWriter.addDocument, which won't buffer any delete terms, so the applyDeletesAndUpdates call should be a no-op. It also makes flushes more efficient, since all of your indexing buffer goes to the added documents rather than to buffered delete terms. Are you using updateDocument?
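To make the difference concrete, here is a toy model in plain Java. It is not Lucene code: the class ToyIndexBuffer and its methods are hypothetical names invented for this sketch, and the only behavior it assumes is that an update is a delete-by-term followed by an add, while a plain add buffers nothing but the document.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only, NOT Lucene's implementation. It mimics what an add-only
// call and an update call each place into the in-memory indexing buffer.
class ToyIndexBuffer {
    private final List<String> bufferedDocs = new ArrayList<>();
    // Delete terms that would have to be resolved against existing segments.
    private final Set<String> bufferedDeleteTerms = new HashSet<>();

    // Analogue of IndexWriter.addDocument: buffers the document only.
    void addDocument(String id) {
        bufferedDocs.add(id);
    }

    // Analogue of IndexWriter.updateDocument: delete-by-term, then add.
    void updateDocument(String id) {
        bufferedDeleteTerms.add(id);
        bufferedDocs.add(id);
    }

    int bufferedDocCount() {
        return bufferedDocs.size();
    }

    int bufferedDeleteTermCount() {
        return bufferedDeleteTerms.size();
    }

    public static void main(String[] args) {
        ToyIndexBuffer addOnly = new ToyIndexBuffer();
        ToyIndexBuffer updating = new ToyIndexBuffer();
        for (int i = 0; i < 1000; i++) {
            String id = "doc-" + i; // unique ids, no duplicates
            addOnly.addDocument(id);
            updating.updateDocument(id);
        }
        System.out.println("add-only: docs=" + addOnly.bufferedDocCount()
                + ", deleteTerms=" + addOnly.bufferedDeleteTermCount());
        System.out.println("update:   docs=" + updating.bufferedDocCount()
                + ", deleteTerms=" + updating.bufferedDeleteTermCount());
    }
}
```

With unique ids, the update path in this model buffers one delete term per document even though nothing is ever actually deleted, while the add-only path buffers none. That is the kind of work an add-only workload should not be paying for.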
Can you reproduce this slowness on a newer release? There have been performance issues fixed in newer releases in this method, e.g. https://issues.apache.org/jira/browse/LUCENE-6161

Have you changed any IndexWriterConfig settings from defaults?

What are your unique id fields like? How many bytes in length?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 28, 2016 at 5:01 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:

> While trying to get higher performance for indexing it turned out that
> BufferedUpdateStreams is breaking indexing performance.
> public synchronized ApplyDeletesResult applyDeletesAndUpdates(...)
>
> At IndexWriterConfig I have setRAMBufferSizeMB=1024 and the Lucene 4.10.4
> API states:
> "Determines the amount of RAM that may be used for buffering added
> documents and deletions before they are flushed to the Directory.
> Generally for faster indexing performance it's best to flush by RAM
> usage instead of document count and use as large a RAM buffer as you can."
>
> Also setMaxBufferedDocs=-1 and setMaxBufferedDeleteTerms=-1.
>
> BD 0 [Wed Jul 27 13:42:03 GMT+01:00 2016; Thread-27890]: applyDeletes:
> infos=...
> BD 0 [Wed Jul 27 14:38:55 GMT+01:00 2016; Thread-27890]: applyDeletes took
> 3411845 msec
>
> About 56 minutes no indexing and only applying deletes.
> What is it deleting?
>
> If the index gets bigger the time gets longer, currently 2.5 hours of
> waiting.
> I'm adding 96 million docs with uniq id, no duplicates, only add, no
> deletes.
>
> Any suggestions which config is _really_ going for high performance
> indexing?
>
> Best regards,
> Bernd