Chris Hostetter wrote:
this is something anyone using the Lucene API can do as long as they use a
HitCollector ... the Nutch impl seems to ctually spin up a seperate thread

I'm keen to understand the pros and cons of these two approaches.

With the HitCollector approach is this just engineering a fall at the final hurdle? It could be that long running queries spend all their time doing edit-distance comparisions for a a fuzzy boolean query, say or reading TermDocs for a large range filter to create a BitSet only to be aborted at the collection stage? Another point - I noticed in some basic timing tests that calling System.currentTimeMillis() in a tight loop like for *every* call to HitCollector.collect(..) could add reasonable overhead so you probably only want to call this for every nth document collected when testing execution times.

Cheers
Mark


                
___________________________________________________________ Try the all-new Yahoo! Mail. "The New Version is radically easier to use" – The Wall Street Journal http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to