Re: search timeout

Andrzej Bialecki Sat, 17 Mar 2007 13:43:16 -0800

markharw00d wrote:

Chris Hostetter wrote:
this is something anyone using the Lucene API can do as long as they use a HitCollector ... the Nutch impl seems to actually spin up a separate thread
I'm keen to understand the pros and cons of these two approaches.
With the HitCollector approach is this just engineering a fall at the final hurdle? It could be that long running queries spend all their time doing edit-distance comparisons for a fuzzy boolean query, say, or reading TermDocs for a large range filter to create a BitSet, only to be aborted at the collection stage?

Another point - I noticed in some basic timing tests that calling System.currentTimeMillis() in a tight loop, like for *every* call to HitCollector.collect(..), could add reasonable overhead, so you probably only want to call this for every nth document collected when testing execution times.


That's why the Nutch implementation doesn't do this (I know, I wrote it ;) ).

What it does is the following (please see the patch for details):

* it creates a single (static) timer thread, which counts "ticks", one every couple hundred ms (configurable). It uses a volatile int counter, therefore avoiding the need to synchronize.

* each HitCollector records the start tick count in its constructor, and then checks the current tick count in collect(...). If the difference is too large then it throws a RuntimeException (NOTE: would someone *please* refactor this API so that we can exit this loop more gracefully!).

This design has several benefits: it avoids creating too many timer threads (there is just one per JVM), it avoids the need to synchronize on the value being changed, and it avoids calling System.currentTimeMillis().
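
For readers without the patch at hand, here is a rough, self-contained sketch of the idea. It is not the code from the patch: the names TickTimer, TimeLimitedCollector and TimeExceededException are made up for illustration, and the collector is a plain class rather than a real Lucene HitCollector subclass, so that it compiles without Lucene on the classpath.

```java
// Illustrative sketch only; all class names are hypothetical, not from the Nutch patch.

/** One timer thread per JVM that bumps a volatile counter every resolutionMillis. */
final class TickTimer {
    // Written only by the timer thread, read by any number of collectors.
    // volatile makes the writes visible without locking; with a single
    // writer the non-atomic ++ cannot lose updates.
    private static volatile int ticks = 0;
    private static Thread thread;

    /** Starts the shared timer thread once; later calls are no-ops. */
    static synchronized void start(final long resolutionMillis) {
        if (thread != null) {
            return;
        }
        thread = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(resolutionMillis);
                } catch (InterruptedException e) {
                    return; // stop counting if interrupted
                }
                ticks++;
            }
        }, "tick-timer");
        thread.setDaemon(true); // do not keep the JVM alive
        thread.start();
    }

    static int ticks() {
        return ticks;
    }
}

/** Unchecked, because collect(...) offers no other way to leave the search loop. */
class TimeExceededException extends RuntimeException {
    TimeExceededException(String message) {
        super(message);
    }
}

/** Stand-in for a HitCollector subclass; collect mirrors collect(int doc, float score). */
final class TimeLimitedCollector {
    private final int startTick;
    private final int maxTicks;
    private int collected = 0;

    TimeLimitedCollector(int maxTicks) {
        this.startTick = TickTimer.ticks(); // record the start tick at construction
        this.maxTicks = maxTicks;
    }

    void collect(int doc, float score) {
        // A volatile read per hit, instead of System.currentTimeMillis().
        if (TickTimer.ticks() - startTick > maxTicks) {
            throw new TimeExceededException("search exceeded " + maxTicks + " ticks");
        }
        collected++; // a real collector would store or rank the hit here
    }

    int collected() {
        return collected;
    }
}
```

The caller would start the timer once (e.g. TickTimer.start(200)), pass a fresh collector to each search, and catch the exception around the search call to return whatever was collected so far. Note that the effective timeout is only accurate to within one tick.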


Best regards,
Andrzej
