Re: The best way to iterate over document

2008-03-27 Thread Erick Erickson
See below...

On Wed, Mar 26, 2008 at 11:29 AM, Wojtek H <[EMAIL PROTECTED]> wrote:
> Thank you for reply. What I did not mention before was that for
> iteration we don't care about scoring, so that's not the issue at all.
> Creating Filter with BitSet seems much better idea than keeping
> HitIter

Re: The best way to iterate over document

2008-03-26 Thread Wojtek H
Thank you for the reply. What I did not mention before was that for iteration we don't care about scoring, so that's not an issue at all. Creating a Filter with a BitSet seems a much better idea than keeping a HitIterator in memory. Am I right that in such a case with MatchAllDocsQuery memory usage would be a

Re: The best way to iterate over document

2008-03-26 Thread Erick Erickson
Why not keep a Filter in memory? It consists of a single bit per document and the ordinal position of that bit is the Lucene doc ID. You could create this reasonably quickly for the *first* query that came in via HitCollector. Then each time you wanted another chunk, use the filter to know which d
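
To illustrate the one-bit-per-document idea, here is a minimal sketch that uses only java.util.BitSet and no Lucene classes. The class name DocIdChunks, the method nextChunk, the index size, and the loop that sets every third bit are all hypothetical stand-ins. In a real application the bits would presumably be set from inside the collector callback for the first query, and this is not necessarily how the poster implemented it.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class DocIdChunks {

    /**
     * Returns up to chunkSize doc IDs whose bits are set,
     * scanning from fromDoc (inclusive) upward.
     */
    static List<Integer> nextChunk(BitSet matches, int fromDoc, int chunkSize) {
        List<Integer> chunk = new ArrayList<>();
        for (int doc = matches.nextSetBit(fromDoc);
             doc >= 0 && chunk.size() < chunkSize;
             doc = matches.nextSetBit(doc + 1)) {
            chunk.add(doc);
        }
        return chunk;
    }

    public static void main(String[] args) {
        int maxDoc = 20; // hypothetical index size

        // One bit per document; the bit position is the doc ID.
        BitSet matches = new BitSet(maxDoc);

        // Stand-in for the collector: mark each matching doc ID once.
        for (int doc = 0; doc < maxDoc; doc += 3) {
            matches.set(doc);
        }

        // Walk the matches in chunks of three, remembering only
        // the position to resume from between chunks.
        int from = 0;
        while (true) {
            List<Integer> chunk = nextChunk(matches, from, 3);
            if (chunk.isEmpty()) {
                break;
            }
            System.out.println(chunk);
            from = chunk.get(chunk.size() - 1) + 1;
        }
    }
}
```

Between chunks the only state kept is the bit set and the resume position, so the memory cost is roughly one bit per document in the index (about 1.25 MB for 10 million documents), regardless of how many documents match.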