Re: simultaneous indexing and searching causing intermitently long searches.

Michael McCandless Sun, 05 Apr 2009 05:23:30 -0700

On Sat, Apr 4, 2009 at 5:03 PM, Dan OConnor <docon...@acquiremedia.com> wrote:


> We have watched GC times closely in the past. Most of the results of us 
> trying various settings was to make GC worse instead of better.

OK, but is a slow "stop the world" GC event to blame for the queries
that unexpectedly long to come back?  (vs, "a big merge was running",
vs "the OS had swapped pages out").

> We didn't know about reopen() until recently so we still create a new 
> IndexSearcher in the background and warm it up before we put it into service. 
> We are going to switch to ping-ponging between to IndexSearchers. When we 
> call reopen(), do we have to warm it up again if reopen returns a new 
> IndexReader?

Yes you still need to fully warm after using reopen.  Unfortunately,
in 2.4 reopen() doesn't save all that much on the warming time, but in
2.9 it saves quite a bit when sorting by fields.  Still, it does save
some (eg loading norms) so you should switch to reopen right away.

> After I sent this email, I found that the ConcurrentMergeScheduler has both a 
> setMaxThreadCount and a setMergeThreadPriority. Does this allow our code to 
> tell the JVM to run merge at a lower priority (and perhaps with more threads) 
> than our IndexWriter and IndexSearcher threads (which are created with normal 
> priority)?

Yes, you can set the merge thread priority.  By default it's one tick
higher than your indexer threads, the reasoning being to prevent
starvation of the merge thread when you are doing intense indexing.
Since you are indexing & searching on the same box, you should try
dropping merge thread priority to one tick lower and see if that helps
things.  If it does, please report back because this is a good
tunable.

You can also set the max number of merge threads.  It defaults to 3,
which may be too high (even a single merge thread may saturate your IO
system); you might want to try setting it to 1... if that helps, I'd
love to hear back as well.

> For warm-ups, since we sort on a couple of date fields within the document 
> (in addition to the straight relevance sort), I'm reading your suggestion 
> that it is important to issue warm up queries that date sort as well?

Definitely.  Sorting by field must load the FieldCache entry (under
the hood), which is a costly operation.  With 2.9, after reopen,
loading this field cache will be much less costly because only the new
segments will need to be loaded.  But 2.4 is not as efficient after a
reopen, and will reload the entire FieldCache (ie, for all docs).

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: simultaneous indexing and searching causing intermitently long searches.

Reply via email to