Hi, Trond,

By the way, it appears to me that Lucene uses the iterator pattern a lot,
e.g. SegmentTermEnum, TermDocs, TermPositions, etc. Each iterator uses an
underlying fixed-size buffer to load a chunk of data at a time. So even if
you have millions of documents, you shouldn't run into memory problems,
given Lucene's index data structure.
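
For instance, here is a minimal sketch of walking a posting list with
TermDocs. The index path and the "docID"/"file1" values are just
placeholders, and this assumes the 1.4-era IndexReader/TermDocs API:

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

public class TermDocsDemo {
    public static void main(String[] args) throws IOException {
        IndexReader reader = IndexReader.open("/path/to/index"); // placeholder path
        TermDocs termDocs = reader.termDocs(new Term("docID", "file1"));
        try {
            while (termDocs.next()) {       // steps through postings chunk by chunk
                int doc = termDocs.doc();   // internal Lucene document number
                int freq = termDocs.freq(); // term frequency in that document
                // process the hit here; only a small buffer is resident at a time
            }
        } finally {
            termDocs.close();
            reader.close();
        }
    }
}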

That said, there are some parameters you can tweak when you have a really
large number of documents, but I suggest you not worry about them unless you
actually run into a problem.
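
One such knob, just as an illustration, is BooleanQuery's clause limit: a
query expanded from many terms, like docID:(file1 file2 ...), throws
TooManyClauses once it exceeds the default of 1024 clauses. If you do hit
that, you can raise the limit:

import org.apache.lucene.search.BooleanQuery;

// The default maximum is 1024 clauses; raise it before running the large query.
BooleanQuery.setMaxClauseCount(20000);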

Cheers,

Jian

On 10/16/05, Trond Aksel Myklebust <[EMAIL PROTECTED]> wrote:
>
> How does Lucene handle very large queries? I have 6 million documents,
> each of which has a "docID" field. There are a total of 20000 distinct
> docIDs, so many documents share the same docID, which consists of a
> filename (only the name, not the path).
>
> Sometimes I must get all documents that have one of 10 docIDs, and
> sometimes I need to get all documents that have one of 10000 docIDs. Is
> there any other way than doing a query: docID:(file1 file2 file3 file4..) ?
>
> Trond A Myklebust
