Thanks for the quick reply!

> What do you mean by "Lucene complain about too-many uncommitted docs"?

--> Good question, I was thoughtlessly echoing words from my colleague. I
asked him and he said that it was about commits taking very long and about
memory issues. So maybe this wasn't the best opening statement :)

For the other part of the question: we need users to see the changed
documents immediately, but I think we have this covered by using NRT readers
and the SearcherManager. Am I correct to conclude that calling commit() is
not necessary for finding recently changed documents?

I think we can then switch to a time-based commit(), where we just call
commit every 5 minutes, in effect losing a maximum of 5 minutes of work
(which we can mitigate in another way) when the server somehow stops working.

Thank you,
-Rob

On Wed, Nov 30, 2016 at 3:17 PM, Michael McCandless
<luc...@mikemccandless.com> wrote:

> What do you mean by "Lucene complain about too-many uncommitted docs"?
> Lucene does not really care how frequently you commit...
>
> How frequently you commit is really your choice, i.e. what risk you
> see of power loss / OS crash vs the cost (not just in CPU/IO work for
> the computer, but in the users not seeing the recently indexed
> documents for a while) of replaying those documents since the last
> commit when power comes back.
>
> Pushing durability back into the queue/channel can be a nice option
> too, e.g. Kafka, so that your application doesn't need to keep track
> of which docs were not yet committed.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Nov 30, 2016 at 8:50 AM, Rob Audenaerde
> <rob.audenae...@gmail.com> wrote:
> > Hi all,
> >
> > Currently we call commit() many times on our index (about 5M docs, where
> > some 10.000-100.000 modifications during the day). The commit times
> > typically get more expensive when the index grows, up to several seconds,
> > so we want to reduce the number of calls.
> >
> > (Historically, we had Lucene complain about too-many uncommitted docs
> > sometimes, so we went with the commit often approach.)
> >
> > What is a good strategy for calling commit? Fixed frequency? After X docs?
> > Combination?
> >
> > I'm curious what is considered 'industry-standard'. Can you share some of
> > your experience?
> >
> > Thanks!
> >
> > -Rob
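
For illustration, the time-based commit idea from the reply at the top could be sketched roughly as below. This is only a sketch under assumptions: PeriodicCommitter, CommitAction and markDirty() are hypothetical names, not Lucene APIs. In a real setup the action would presumably be something like () -> indexWriter.commit(), while searches keep going through the NRT readers / SearcherManager, so visibility of changes should not depend on when the commit runs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper (not part of Lucene): runs a commit action at a fixed
 * interval, but only when something has changed since the last commit.
 */
final class PeriodicCommitter implements AutoCloseable {

    /** Stand-in for the real commit call; assumed to be e.g. () -> indexWriter.commit(). */
    public interface CommitAction {
        void commit() throws Exception;
    }

    private final CommitAction action;
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicCommitter(CommitAction action, long interval, TimeUnit unit) {
        this.action = action;
        // Fixed delay: the next tick is counted from the end of the previous
        // commit, so a slow commit never overlaps with the next one.
        scheduler.scheduleWithFixedDelay(this::commitIfDirty, interval, interval, unit);
    }

    /** Call after every add/update/delete so the next tick knows there is work. */
    public void markDirty() {
        dirty.set(true);
    }

    /** Commits only if there are changes; returns true when a commit succeeded. */
    public synchronized boolean commitIfDirty() {
        if (!dirty.compareAndSet(true, false)) {
            return false; // nothing changed since the last commit
        }
        try {
            action.commit();
            return true;
        } catch (Exception e) {
            dirty.set(true); // keep the changes marked so the next tick retries
            return false;
        }
    }

    /** Stops the timer and attempts one last commit of any pending changes. */
    @Override
    public void close() {
        scheduler.shutdown();
        commitIfDirty();
    }
}
```

With a 5 minute interval this would be constructed as new PeriodicCommitter(action, 5, TimeUnit.MINUTES). The dirty flag is a design choice so that idle periods do not trigger empty commits. Whatever is lost between two ticks still has to be replayed from elsewhere, as discussed in the thread.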