On Thu, Sep 25, 2008 at 1:39 PM, David Lee <[EMAIL PROTECTED]> wrote:
> I was wondering when lucene queries two or more terms, does that mean the
> time it takes will be twice as long? For example if I search +lucene
> +apache, then does lucene get all the documents that match 'lucene' and all
> the documents that match 'apache', and then combine them together? Or can it
> limit the amount of things it needs to retrieve from the index for 'apache'
> based on what it has already retrieved for 'lucene'?

Closer to the latter.  Lucene evaluates all clauses in parallel, in
the sense that it advances them together in lockstep (not on separate
threads).  A document id iterator is created for each term, and each
iterator is "skipped" forward to the highest id seen so far until all
iterators are on the same id, which yields a match.  Because a skip
can jump over many postings of the common term, an AND query of a
rare term and a common term can be faster than the common term alone.
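
To make the skipping concrete, here is a minimal sketch of that
lockstep loop.  It is not Lucene's code: ConjunctionSketch,
DocIdIterator, advance() and intersect() are hypothetical names, the
postings are plain sorted int arrays held in memory, and a binary
search stands in for the skip data a real index would use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; these are not Lucene's actual classes. */
final class ConjunctionSketch {

    /** Hypothetical per-term iterator over a sorted array of doc ids. */
    static final class DocIdIterator {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;

        private final int[] docIds; // ascending, no duplicates, all >= 0
        private int pos = -1;       // -1 means "not positioned yet"

        DocIdIterator(int... docIds) {
            this.docIds = docIds;
        }

        /** Moves to the first doc id >= target and returns it. */
        int advance(int target) {
            int from = Math.max(pos, 0);
            if (from >= docIds.length) {
                pos = docIds.length;
                return NO_MORE_DOCS;
            }
            // Assumption: binary search plays the role of on-disk skip data.
            int idx = Arrays.binarySearch(docIds, from, docIds.length, target);
            pos = idx >= 0 ? idx : -idx - 1;
            return pos < docIds.length ? docIds[pos] : NO_MORE_DOCS;
        }
    }

    /** Returns the doc ids present in every iterator (an AND query). */
    static List<Integer> intersect(DocIdIterator... iterators) {
        List<Integer> matches = new ArrayList<>();
        int n = iterators.length;
        if (n == 0) {
            return matches;
        }
        int candidate = iterators[0].advance(0); // highest id seen so far
        int agreed = 1;                          // iterators sitting on candidate
        int i = 1 % n;                           // next iterator to move
        while (candidate != DocIdIterator.NO_MORE_DOCS) {
            if (agreed == n) {
                // Every iterator is on the same id: that is a match.
                matches.add(candidate);
                candidate = iterators[i].advance(candidate + 1);
                agreed = 1;
            } else {
                int doc = iterators[i].advance(candidate);
                if (doc == candidate) {
                    agreed++;
                } else {
                    // This iterator overshot, so its id is the new target.
                    candidate = doc;
                    agreed = 1;
                }
            }
            i = (i + 1) % n;
        }
        return matches;
    }
}
```

With postings {3, 7, 900} for a rare term and {1, 2, ..., 10} for a
common one, intersect() returns [3, 7].  The common term's iterator is
only ever asked to advance to ids the rare term has proposed, so most
of its postings are never visited one by one.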

> Is there documentation on how queries work in lucene in regards to how it
> deals with the actual index files?

TermScorer and ConjunctionScorer are the classes you would want to look at.

-Yonik
