On Thu, Sep 25, 2008 at 1:39 PM, David Lee <[EMAIL PROTECTED]> wrote: > I was wondering when lucene queries two or more terms, does that mean the > time it takes will be twice as long? For example if I search +lucene > +apache, then does lucene get all the documents that match 'lucene' and all > the documents that match 'apache', and then combine them together? Or can it > limit the amount of things it needs to retrieve from the index for 'apache' > based on what it has already retrieved for 'lucene'?
Closer to the latter. Lucene evaluates all clauses in parallel. A document id iterator is created for each term, and they are all "skipped" to the highest id yet seen until all iterators are on the same id, which yields a match. So an AND query of a rare term and a common term can be faster than the common term alone. > Is there documentation on how queries work in lucene in regards to how it > deals with the actual index files? TermScorer and ConjunctionScorer are the classes you would want to look at. -Yonik --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]