UNOFFICIAL

Hi Mike,

The hits do seem to come back in docId order. I don't know if they do that every time though. Might be best to sort them.

Compiling statistics in the collector sounds like a good idea. I might do that.

Thanks,
Steve

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Monday, 4 November 2013 9:49 PM
To: Lucene Users
Subject: Re: splitting docIds from a search by segment [SEC=UNOFFICIAL]

On Sun, Nov 3, 2013 at 7:59 PM, Stephen GRAY <stephen.g...@immi.gov.au> wrote:
> UNOFFICIAL
>
> Hi Mike,
>
> I ran it again and this time the two methods came out about the same: 168 -
> 288 ms to process 173,000 documents for the walking method and 160 - 205 ms
> for the MultiDocValues method. I don't know what was happening with my last
> test.

Hmm, still curious. But it could simply be that the per-doc binary search is in the noise...

> Here is my code:

The code looks correct, but are you certain the hits come back in docID order? Are you sorting by (SortField.FIELD_DOC)?

> Thanks for the tip on using a custom Collector. This is in Lucene in Action
> (great book by the way).

I'm glad to hear that, thanks!

Another option is to fold this processing (looking up the NDV value for the doc and then doing something) into your Collector: it's already told whenever it's switching to a new reader, so you'd lookup your NDV instance there, and then in collect(int doc), do your processing.

Mike McCandless

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
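
A minimal sketch of the "sort, then walk the segments" idea discussed in this thread, written in plain Java so it runs without Lucene on the classpath. The docBases array is an assumption: it stands in for the first global docID of each segment (in Lucene 4.x this value is available as AtomicReaderContext.docBase). SegmentSplitter and splitBySegment are hypothetical names for illustration, not Lucene APIs, and the sketch assumes at least one segment with docBases in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SegmentSplitter {

    /**
     * Groups global docIDs by segment and converts them to segment-local IDs.
     * docBases[i] is assumed to be the first global docID of segment i,
     * in ascending order, with docBases[0] == 0.
     */
    static List<List<Integer>> splitBySegment(int[] globalDocIds, int[] docBases) {
        // Sort a copy so the walk below never depends on the order
        // in which the hits happened to be returned.
        int[] sorted = globalDocIds.clone();
        Arrays.sort(sorted);

        List<List<Integer>> perSegment = new ArrayList<>();
        for (int i = 0; i < docBases.length; i++) {
            perSegment.add(new ArrayList<>());
        }

        int seg = 0;
        for (int docId : sorted) {
            // Advance to the segment that contains docId; because the hits
            // are sorted, the segment index only ever moves forward.
            while (seg + 1 < docBases.length && docId >= docBases[seg + 1]) {
                seg++;
            }
            perSegment.get(seg).add(docId - docBases[seg]);
        }
        return perSegment;
    }

    public static void main(String[] args) {
        int[] docBases = {0, 100, 250};      // three hypothetical segments
        int[] hits = {251, 3, 100, 99, 260}; // deliberately unsorted
        System.out.println(splitBySegment(hits, docBases));
        // prints [[3, 99], [0], [1, 10]]
    }
}
```

Sorting once up front costs O(n log n), after which the walk is a single linear pass over hits and segments, instead of one binary search per document. In a real index the same per-segment work could instead live inside a custom Collector, as suggested above, since the collector is handed segment-local docIDs directly.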