Our index is for keeps: it keeps growing for many weeks until we decide to re-ingest. I am counting the docs added within my little server and calling commit every 5 million docs or so. I will play with the threshold and also bring in the elapsed time since the last commit to refine it. I am a little surprised that when a flush to disk happens (and I can see that this happens often enough under the hood), there is virtually no pause in indexing (or searching), but during a commit there seems to be a big pause. I will play with it, look at the code, and attempt to understand.
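
A rough sketch of the count-or-elapsed-time trigger described above. This is only an illustration under assumptions, not tested against a real index: `CommitPolicy`, its constructor parameters, and `onDocumentAdded` are hypothetical names, and the commit itself is passed in as a plain `Runnable` so the sketch does not depend on Lucene. In the real server the action would wrap a call to `IndexWriter.commit()`.

```java
/**
 * Hypothetical helper (not part of Lucene): triggers a commit when either
 * a document-count threshold or an elapsed-time threshold is reached.
 */
final class CommitPolicy {
    private final long maxDocs;          // commit after this many docs since the last commit
    private final long maxNanos;         // ...or after this much time since the last commit
    private final Runnable commitAction; // assumed to wrap IndexWriter.commit() in real use

    private long docsSinceCommit = 0;
    private long lastCommitNanos;

    CommitPolicy(long maxDocs, long maxNanos, long startNanos, Runnable commitAction) {
        this.maxDocs = maxDocs;
        this.maxNanos = maxNanos;
        this.lastCommitNanos = startNanos;
        this.commitAction = commitAction;
    }

    /**
     * Call once after each document is added, passing the current time
     * (for example System.nanoTime()). Returns true if a commit was triggered.
     * Synchronized so several indexing threads can share one policy.
     */
    synchronized boolean onDocumentAdded(long nowNanos) {
        docsSinceCommit++;
        boolean countReached = docsSinceCommit >= maxDocs;
        boolean timeReached = nowNanos - lastCommitNanos >= maxNanos;
        if (countReached || timeReached) {
            commitAction.run();
            docsSinceCommit = 0;
            lastCommitNanos = nowNanos;
            return true;
        }
        return false;
    }
}
```

Note that the time check only runs when a document arrives, so a long idle period will not trigger a commit by itself; a separate timer would be needed for that. Since `IndexWriter.commit()` throws `IOException`, the `Runnable` passed in would have to catch or rethrow it.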
Thanks for the help,

On Tue, Nov 4, 2014 at 12:01 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Tue, Nov 4, 2014 at 11:44 AM, Shouvik Bardhan <sbard...@gisfederal.com> wrote:
>
> > Thanks for the reply (and thanks for everything else too !!) Mike.
>
> You're welcome!
>
> > I am unable to understand when to call commit. Should I start counting the
> > number of documents I am ingesting and then say every 10 million docs do a
> > commit()? I dont want to do a commit too frequently cause that does not
> > sound correct. Since the ingest velocity is variable, the best thing would
> > have been if I could have this commit() happen when say X number of docs
> > are written. I will try and see if I could find a way to find a good time
> > to commit.
>
> It's really up to you.
>
> commit is quite costly, especially for spinning magnets disks, so you
> should call it rarely.
>
> But then it gives you durability, meaning if the OS crashes, computer
> loses power, JVM crashes or is killed, etc., on startup your index
> will only reflect the last successful commit, so you want to call
> commit frequently enough so you don't lose too much data on such
> events.
>
> If the data is transient / you can simply start indexing again on
> startup, then never call commit :)
>
> Mike McCandless
>
> http://blog.mikemccandless.com