You could also use Lucene's "search after" capability. It's designed for exactly this use-case (deep paging).
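
To make the idea concrete, here is a rough plain-Java model of what "search after" buys you. It is not Lucene code: `SearchAfterModel`, `Hit` and `nextPage` are names made up for this sketch, and the list of hits stands in for the index. The point is that the heap only ever holds one page of hits, no matter how deep the page is, because everything at or before the last hit of the previous page is skipped. In Lucene itself the entry point is `IndexSearcher.searchAfter`, which takes the last `ScoreDoc` of the previous page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of "search after" paging; not part of Lucene.
final class SearchAfterModel {

    // Stand-in for one search hit: a sort value plus a doc id used as tie-break.
    static final class Hit {
        final long timestamp;
        final int docId;

        Hit(long timestamp, int docId) {
            this.timestamp = timestamp;
            this.docId = docId;
        }
    }

    // Sort by timestamp, then by doc id so that equal timestamps have a stable order.
    static final Comparator<Hit> ORDER =
        Comparator.comparingLong((Hit h) -> h.timestamp).thenComparingInt(h -> h.docId);

    // Returns the next pageSize hits that sort strictly after `after`
    // (pass null for the first page). Assumes pageSize >= 1.
    static List<Hit> nextPage(Iterable<Hit> hits, Hit after, int pageSize) {
        // Max-heap bounded to pageSize entries, independent of how deep we page.
        PriorityQueue<Hit> heap = new PriorityQueue<>(pageSize, ORDER.reversed());
        for (Hit h : hits) {
            if (after != null && ORDER.compare(h, after) <= 0) {
                continue; // already shown on an earlier page
            }
            heap.offer(h);
            if (heap.size() > pageSize) {
                heap.poll(); // evict the hit that sorts last
            }
        }
        List<Hit> page = new ArrayList<>(heap);
        page.sort(ORDER);
        return page;
    }
}
```

The caller keeps the last `Hit` of each page and hands it back in as `after` to get the following page.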
See https://issues.apache.org/jira/browse/LUCENE-2215

Mike McCandless

http://blog.mikemccandless.com

On Thu, Mar 14, 2013 at 6:03 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> On Thu, 2013-03-14 at 04:11 +0100, dizh wrote:
>> each document has a timestamp identifying the time at which it was
>> indexed. I want to search the documents using sort; the sort field is
>> the timestamp,
>
> [...]
>
>> but when you do paging, for example in a web app, and the user wants to
>> go to the last page, 4999980-5000000, well, it is slow...
>
> Yes. The problem is that it performs a sliding window search with a
> window size of page+topX, and that does not work well with 5M entries,
> especially as it uses a heap, which works very well for small windows
> but horribly for large windows.
>
>> I have a large number of Log4J logs, and I want to index them and
>> present them using a web UI.
>
> I still don't see why you would want to page to 5M, but okay.
>
> Instead of representing the timestamps directly, convert them to unique
> longs when indexing. Guessing that you always have fewer than 1000 log
> entries/ms, your long would be
>   (timestamp_in_ms << 10) | counter++
> where the counter is reset to 0 each time a different timestamp is
> encountered. This also ensures that the order of your log entries is
> preserved. Let's call the modified timestamps utime.
>
> When you do a paginated search for 20 results, keep track of the last
> utime. When you request the next page, add a NumericRangeFilter going
> from the last utime (non-inclusive) with no upper limit and ask for the
> top-20 results again.
>
>
> NB: Please get rid of the garbage that follows each of your posts on
> this mail list. The Confidentiality Notice has negative value here.
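
The utime scheme quoted above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `UtimeEncoder` and `UtimePager` are hypothetical names, and the `NavigableSet` stands in for the index with its range filter (exclusive lower bound, no upper bound). The timestamp and the counter are combined with bitwise OR, because the low 10 bits are zero after the shift; a bitwise AND at that point would throw the timestamp bits away.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

// Hypothetical helper: turns millisecond timestamps into unique, ordered longs.
final class UtimeEncoder {
    private static final int COUNTER_BITS = 10;                        // room for 1024 entries per ms
    private static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private long lastTimestampMs = Long.MIN_VALUE;
    private long counter = 0;

    // Must be called in indexing order; the counter restarts whenever the timestamp changes.
    long next(long timestampMs) {
        if (timestampMs != lastTimestampMs) {
            lastTimestampMs = timestampMs;
            counter = 0;
        }
        if (counter > COUNTER_MASK) {
            throw new IllegalStateException("more than 1024 entries in one millisecond");
        }
        return (timestampMs << COUNTER_BITS) | counter++;
    }

    // Recovers the original millisecond timestamp from a utime.
    static long timestampOf(long utime) {
        return utime >> COUNTER_BITS;
    }
}

// Hypothetical model of the paging step: the sorted set plays the role of the index.
final class UtimePager {
    // Returns up to pageSize utimes strictly greater than lastUtime (null = first page).
    static List<Long> nextPage(NavigableSet<Long> utimes, Long lastUtime, int pageSize) {
        Iterable<Long> tail = (lastUtime == null) ? utimes : utimes.tailSet(lastUtime, false);
        List<Long> page = new ArrayList<>();
        for (Long u : tail) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(u);
        }
        return page;
    }
}
```

Because every utime is unique, the last value on a page is a complete cursor by itself; no extra tie-break is needed when asking for the next page.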

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org