Re: timing merges

2014-06-12 Thread Erick Erickson
Ah, OK. Ignore me then and listen to Mike.
On Thu, Jun 12, 2014 at 7:54 AM, Jamie wrote:
> Erick
>
> We are not using Solr. We are using the latest version of Lucene directly.
> When I run it in a profiler, I can see all indexing threads blocked on merge
> for long stretches at a time.
>
> Re

Re: timing merges

2014-06-12 Thread Jamie
Erick
We are not using Solr. We are using the latest version of Lucene directly. When I run it in a profiler, I can see all indexing threads blocked on merge for long stretches at a time.
Regards
Jamie
On 2014/06/12, 4:39 PM, Erick Erickson wrote:
What version of Solr/Lucene? Merging is sup

Re: timing merges

2014-06-12 Thread Michael McCandless
1000 is way too high: your index will end up with 1000s of segments, and when a merge does run it will take a very long time. It's better to do smaller, more frequent merges. Try setting segmentsPerTier to 5. It's possible you are hitting too big a merge backlog, in which case the default Me
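
To make the segment-count argument concrete, here is a toy model in plain Java. It is only a sketch and not Lucene's actual TieredMergePolicy logic: it assumes every flush produces an equal-sized segment and that a merge fires exactly when a level holds segmentsPerTier segments. The class and method names (TierToyModel, segmentCount) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: NOT Lucene's TieredMergePolicy. Names here are hypothetical. */
class TierToyModel {

    /**
     * Simulates `flushes` equal-sized segment flushes. Whenever a level holds
     * `segmentsPerTier` segments, they are merged into one segment on the next level.
     * Returns the number of live segments left at the end.
     */
    static int segmentCount(int flushes, int segmentsPerTier) {
        if (segmentsPerTier < 2) {
            throw new IllegalArgumentException("segmentsPerTier must be >= 2");
        }
        List<Integer> perLevel = new ArrayList<>(); // perLevel.get(i) = live segments on level i
        for (int f = 0; f < flushes; f++) {
            addSegment(perLevel, 0, segmentsPerTier);
        }
        int total = 0;
        for (int n : perLevel) {
            total += n;
        }
        return total;
    }

    /** Adds one segment to `level`, cascading a merge upward when the level fills up. */
    private static void addSegment(List<Integer> perLevel, int level, int segmentsPerTier) {
        while (perLevel.size() <= level) {
            perLevel.add(0);
        }
        int n = perLevel.get(level) + 1;
        if (n == segmentsPerTier) {
            perLevel.set(level, 0);                           // the full level is merged away...
            addSegment(perLevel, level + 1, segmentsPerTier); // ...into one bigger segment
        } else {
            perLevel.set(level, n);
        }
    }

    public static void main(String[] args) {
        System.out.println(segmentCount(999, 1000)); // prints 999
        System.out.println(segmentCount(999, 5));    // prints 15
    }
}
```

Under these simplifying assumptions, 999 flushes leave 999 live segments with segmentsPerTier at 1000 (and the first merge would have to combine 1000 segments at once), versus 15 segments with segmentsPerTier at 5. In Jamie's snippet the corresponding change is just the argument passed to setSegmentsPerTier.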

Re: timing merges

2014-06-12 Thread Erick Erickson
What version of Solr/Lucene? Merging is supposed to have been happening in the background for quite a while now, so I'd be surprised if this was really the culprit unless you're on an older version of Lucene.
See: http://blog.trifork.com/2011/04/01/gimme-all-resources-you-have-i-can-use-them/
But this is ex

Re: timing merges

2014-06-12 Thread Jamie
Erick
Well, I have users complaining about it. They say indexing stops for a long time. Currently, the following settings are applied.
    TieredMergePolicy logMergePolicy = new TieredMergePolicy();
    logMergePolicy.setSegmentsPerTier(1000);
    conf.setMergePolicy(logMergePolicy);
What's a good way

Re: timing merges

2014-06-12 Thread Erick Erickson
Michael is, of course, the Master of Merges... I have to ask, though, have you demonstrated to your satisfaction that you're actually seeing a problem? And that fewer merges would actually address that problem? 'cause this might be an "XY" problem.
Best,
Erick
On Thu, Jun 12, 2014 at 4:11 AM

Re: timing merges

2014-06-12 Thread Michael McCandless
Likely you should implement a custom MergeScheduler (MergePolicy picks which merges to do, and MergeScheduler schedules them). Or you could e.g. make a MergePolicy that picks only "easy-ish" merges during busy times and leaves hard merges for later. Just be very careful: if merges fall behind and
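
The "easy-ish merges during busy times" idea can be sketched in plain Java without touching Lucene at all. Everything below is hypothetical: Candidate stands in for a proposed merge, easyMaxBytes is an assumed size threshold for what counts as "easy-ish", and maxDeferred is an assumed safety valve added here so that deferred merges cannot pile up without bound. A real implementation would have to be wired into a custom MergePolicy or MergeScheduler, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of the idea; none of these types are Lucene classes. */
class BusyAwareMergeFilter {

    /** Hypothetical stand-in for one candidate merge and the bytes it would rewrite. */
    static final class Candidate {
        final String name;
        final long totalBytes;

        Candidate(String name, long totalBytes) {
            this.name = name;
            this.totalBytes = totalBytes;
        }
    }

    private final long easyMaxBytes; // assumption: merges at or below this size count as "easy-ish"
    private final int maxDeferred;   // assumption: how many hard merges may wait before deferral stops

    BusyAwareMergeFilter(long easyMaxBytes, int maxDeferred) {
        this.easyMaxBytes = easyMaxBytes;
        this.maxDeferred = maxDeferred;
    }

    /** Returns the merges to run now; during busy times only the easy ones are kept. */
    List<Candidate> select(List<Candidate> candidates, boolean busy) {
        if (!busy) {
            return new ArrayList<>(candidates); // quiet period: run everything
        }
        List<Candidate> easy = new ArrayList<>();
        int deferred = 0;
        for (Candidate c : candidates) {
            if (c.totalBytes <= easyMaxBytes) {
                easy.add(c);
            } else {
                deferred++;
            }
        }
        // Safety valve (an assumption of this sketch): once too many hard merges
        // are waiting, stop deferring so the backlog cannot keep growing.
        if (deferred > maxDeferred) {
            return new ArrayList<>(candidates);
        }
        return easy;
    }

    public static void main(String[] args) {
        BusyAwareMergeFilter filter = new BusyAwareMergeFilter(100, 1);
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("small-a", 50));
        candidates.add(new Candidate("big-b", 500));
        candidates.add(new Candidate("small-c", 80));
        System.out.println(filter.select(candidates, true).size());  // prints 2
        System.out.println(filter.select(candidates, false).size()); // prints 3
    }
}
```

The split mirrors the division of labor described above: the selection step decides which merges are worth doing now, and whatever schedules them is free to run the returned list. The thresholds are placeholders and would need tuning against a real workload.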