It is also possible to sort by function. This allows you to avoid storing an array of 1 int for all documents. It is slower than the raw Lucene sort.
On Wed, Aug 25, 2010 at 1:46 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote: > On Wed, 2010-08-25 at 07:16 +0200, Shelly_Singh wrote: >> I have 1 bln documents to sort. So, that would mean ( 8 bln bytes == 8GB >> RAM) bytes. >> All I have is 8 GB on my machine, so I do not think approach would work. > > This implies that your numeric value can be more than 2 billion. Are you > sure that is true? > > > First suggestion (simple): Ensure that your sort field is stored and > sort by requesting the value for each document in the search result. > This works okay when the number of hits is small. > > Second suggestion (complex): Make an int-array with the sort-order of > your documents. This takes 4GB and needs to be calculated fully before > use, which will take time. After that sorted searches will be very fast > and handle a large number of hits well. > > You can let your indexer maintain the sort-array so that the existing > order ban be re-used when adding documents. Whether modifying an > existing order-array is cheaper than a full re-sort or not depends on > your batch size. > > Regards, > Toke Eskildsen > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > -- Lance Norskog goks...@gmail.com --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org