On Tue, Nov 4, 2014 at 11:44 AM, Shouvik Bardhan <sbard...@gisfederal.com> wrote:
> Thanks for the reply (and thanks for everything else too!!) Mike.

You're welcome!

> I am unable to understand when to call commit. Should I start counting the
> number of documents I am ingesting and then, say, every 10 million docs do a
> commit()? I don't want to commit too frequently because that does not
> sound correct. Since the ingest velocity is variable, the best thing would
> be if I could have this commit() happen when, say, X number of docs
> are written. I will try and see if I can find a good time
> to commit.

It's really up to you.

commit is quite costly, especially on spinning-magnet disks, so you
should call it rarely.

But it is also what gives you durability: if the OS crashes, the computer
loses power, the JVM crashes or is killed, etc., then on startup your index
will only reflect the last successful commit. So you want to call commit
frequently enough that you don't lose too much data on such events.

If the data is transient / you can simply start indexing again on
startup, then never call commit :)

Mike McCandless

http://blog.mikemccandless.com
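
For what it's worth, here is a rough sketch of the "commit every X docs" idea from the question, with a time limit added so that a slow trickle of documents still gets committed eventually. Nothing here comes from Lucene itself: Committer, CommitPolicy, and the thresholds are hypothetical names and values chosen for illustration, and Committer only stands in for the commit() call on IndexWriter so that the example runs without Lucene on the classpath.

```java
import java.util.concurrent.TimeUnit;

public class CommitPolicyDemo {

    /** Hypothetical stand-in for the single commit() call the policy needs. */
    interface Committer {
        void commit() throws Exception;
    }

    /**
     * Commits once maxDocs documents have been added, or once maxAgeNanos
     * has passed since the last commit, whichever happens first.
     */
    static final class CommitPolicy {
        private final Committer committer;
        private final long maxDocs;
        private final long maxAgeNanos;
        private long docsSinceCommit = 0;
        private long lastCommitNanos;

        CommitPolicy(Committer committer, long maxDocs, long maxAgeNanos, long nowNanos) {
            this.committer = committer;
            this.maxDocs = maxDocs;
            this.maxAgeNanos = maxAgeNanos;
            this.lastCommitNanos = nowNanos;
        }

        /** Call once after each document is added; returns true if a commit ran. */
        boolean onDocumentAdded(long nowNanos) throws Exception {
            docsSinceCommit++;
            boolean tooMany = docsSinceCommit >= maxDocs;
            boolean tooOld = nowNanos - lastCommitNanos >= maxAgeNanos;
            if (!tooMany && !tooOld) {
                return false;
            }
            committer.commit();
            docsSinceCommit = 0;
            lastCommitNanos = nowNanos;
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        long[] commits = {0};
        // Assumed thresholds: 1000 docs or 5 minutes. Tune to how much data
        // you can afford to re-index after a crash.
        CommitPolicy policy = new CommitPolicy(
                () -> commits[0]++, 1000, TimeUnit.MINUTES.toNanos(5), System.nanoTime());

        for (int i = 0; i < 2500; i++) {
            // In a real indexer, the addDocument call would go here.
            policy.onDocumentAdded(System.nanoTime());
        }
        System.out.println("commits: " + commits[0]);
    }
}
```

With a real IndexWriter named writer, passing writer::commit as the Committer should be enough. Since the time check only runs when a document arrives, an ingest that stops completely would still need a final commit() (or a close) on shutdown.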