Re: lucene index reader performance

2016-07-07 Thread Michael McCandless
Somehow you need to get the sorting server-side ... that's really the only way to do your use case efficiently. Why can't you sort each request to your N shards, and then do a merge sort on the client side, to get the top hits? Mike McCandless http://blog.mikemccandless.com On Thu, Jul 7, 2016
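
The approach suggested here (sort on each shard, then merge on the client) can be sketched roughly as below. This is a plain-Python illustration rather than Lucene code: the function name `merge_top_hits`, the sample shard lists, and the `(time, doc_id)` tuple layout are all assumptions made for the example. It only assumes that each shard returns its hits already sorted by the time field.

```python
import heapq
from itertools import islice

def merge_top_hits(shard_results, k):
    """Return the k earliest (time, doc_id) hits across all shards.

    Each element of shard_results must already be sorted by time
    ascending, i.e. every shard did its own sort server-side.
    """
    # heapq.merge performs a lazy k-way merge of sorted inputs, so
    # only the first k merged items are ever pulled from the shards.
    return list(islice(heapq.merge(*shard_results), k))

# Hypothetical per-shard results, each sorted by the time field.
shard_a = [(1, "a-7"), (4, "a-2"), (9, "a-5")]
shard_b = [(2, "b-1"), (3, "b-8"), (10, "b-3")]
shard_c = [(5, "c-4"), (6, "c-9")]

top = merge_top_hits([shard_a, shard_b, shard_c], k=4)
```

Because the merge is lazy, the client never holds more than one pending hit per shard plus the k results it keeps, which fits the constraint that the caller can only hold a few thousand docs at a time.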

Re: lucene index reader performance

2016-07-07 Thread Tarun Kumar
Any suggestions, please? On Mon, Jul 4, 2016 at 3:37 PM, Tarun Kumar wrote: > Hey Michael, > > docIds from multiple indices (from multiple machines) need to be > aggregated and sorted, and the first few thousand need to be queried. These few > thousand docs can be distributed among multiple machines. Each ma

Re: lucene index reader performance

2016-07-04 Thread Tarun Kumar
Hey Michael, docIds from multiple indices (from multiple machines) need to be aggregated and sorted, and the first few thousand need to be queried. These few thousand docs can be distributed among multiple machines. Each machine will search the docs that are in its own indices. So, pulling sorting

Re: lucene index reader performance

2016-07-04 Thread Michael McCandless
Why not ask Lucene to do the sort on your time field, instead of pulling millions of docids to the client and having it sort? You could even do index-time sorting by time field if you want, which makes early termination possible (faster sorted searches). But if even on having Lucene do the sort y
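
To illustrate why index-time sorting makes early termination possible, here is a small conceptual sketch in plain Python. It is not how Lucene implements it; `first_k_matches`, the sample `docs` list, and the `even` flag standing in for a query are hypothetical. The only assumption is that documents are stored in time order, so the first k matches encountered are already the k earliest.

```python
def first_k_matches(docs_sorted_by_time, matches, k):
    """Collect the first k matching docs from a time-sorted sequence.

    Returns the hits and the number of docs that had to be scanned.
    """
    hits = []
    scanned = 0
    for doc in docs_sorted_by_time:
        scanned += 1
        if matches(doc):
            hits.append(doc)
            if len(hits) == k:
                # Storage order equals sort order, so no later doc
                # can rank ahead of these hits: stop scanning.
                break
    return hits, scanned

# Hypothetical index of 1000 docs, stored in increasing time order.
docs = [{"id": i, "time": i * 10, "even": i % 2 == 0} for i in range(1000)]

hits, scanned = first_k_matches(docs, lambda d: d["even"], k=3)
```

With an unsorted index, every matching doc would have to be visited before the top k by time could be known; here the scan stops after a handful of docs.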

Re: lucene index reader performance

2016-07-03 Thread Tarun Kumar
Thanks for the reply, Michael! In my application, I need to get millions of documents per search. The use case is the following: return documents in increasing order of the time field. The client (caller) can't hold more than a few thousand docs at a time, so it gets all docIds and the corresponding time field for each doc

Re: lucene index reader performance

2016-06-28 Thread Michael McCandless
Are you maybe trying to load too many documents for each search request? The IR.document API is designed to be used to load just a few hits, like a page worth or ~ 10 documents, per search. Mike McCandless http://blog.mikemccandless.com On Tue, Jun 28, 2016 at 7:05 AM, Tarun Kumar wrote: > I