Chris Hostetter wrote:
this is something anyone using the Lucene API can do as long as they use a
HitCollector ... the Nutch impl seems to ctually spin up a seperate thread
I'm keen to understand the pros and cons of these two approaches.
With the HitCollector approach is this just engineering a fall at the
final hurdle? It could be that long running queries spend all their time
doing edit-distance comparisions for a a fuzzy boolean query, say or
reading TermDocs for a large range filter to create a BitSet only to be
aborted at the collection stage?
Another point - I noticed in some basic timing tests that calling
System.currentTimeMillis() in a tight loop like for *every* call to
HitCollector.collect(..) could add reasonable overhead so you probably
only want to call this for every nth document collected when testing
execution times.
Cheers
Mark
___________________________________________________________
Try the all-new Yahoo! Mail. "The New Version is radically easier to use" The Wall Street Journal
http://uk.docs.yahoo.com/nowyoucan.html
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]