Re: Sorting a Lucene index

Lance Norskog Wed, 25 Aug 2010 20:19:39 -0700

It is also possible to sort by function. This avoids storing an array
with one int per document, but it is slower than the raw Lucene sort.


On Wed, Aug 25, 2010 at 1:46 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> On Wed, 2010-08-25 at 07:16 +0200, Shelly_Singh wrote:
>> I have 1 billion documents to sort, so that would mean 8 billion bytes
>> (8 GB of RAM). All I have is 8 GB on my machine, so I do not think this
>> approach would work.
>
> This implies that your numeric value can be more than 2 billion. Are you
> sure that is true?
>
>
> First suggestion (simple): Ensure that your sort field is stored and
> sort by requesting the value for each document in the search result.
> This works okay when the number of hits is small.
>
> Second suggestion (complex): Make an int-array with the sort-order of
> your documents. This takes 4GB and needs to be calculated fully before
> use, which will take time. After that sorted searches will be very fast
> and handle a large number of hits well.
>
> You can let your indexer maintain the sort-array so that the existing
> order can be re-used when adding documents. Whether modifying an
> existing order-array is cheaper than a full re-sort or not depends on
> your batch size.
>
> Regards,
> Toke Eskildsen
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
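
For illustration, here is a minimal sketch of the second suggestion quoted
above (an int array holding the sort order of the documents). It is plain
Java, not Lucene API: the class and method names are made up, and it assumes
the sort values are already available as a long[] indexed by document id.
The boxed Integer[] used while building the order is only reasonable for
small inputs; at 1 billion documents a primitive sort would be needed.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Lucene.
final class SortOrderArray {

    // Memory needed for one int per document (4 bytes each).
    static long orderArrayBytes(long docCount) {
        return docCount * Integer.BYTES;
    }

    // Builds order[docId] = rank of that document's value among all values.
    // Equal values get distinct ranks in docId order (the sort is stable).
    static int[] buildOrder(long[] valueByDoc) {
        int n = valueByDoc.length;
        Integer[] docIds = new Integer[n];
        for (int i = 0; i < n; i++) {
            docIds[i] = i;
        }
        Arrays.sort(docIds, Comparator.comparingLong((Integer d) -> valueByDoc[d]));
        int[] order = new int[n];
        for (int rank = 0; rank < n; rank++) {
            order[docIds[rank]] = rank;
        }
        return order;
    }

    // Returns the hit docIds sorted by their precomputed rank.
    // Only the small int ranks are compared, never the original values.
    static int[] sortHits(int[] hits, int[] order) {
        return Arrays.stream(hits)
                .boxed()
                .sorted(Comparator.comparingInt((Integer d) -> order[d]))
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        long[] valueByDoc = {30L, 10L, 20L};      // value for docId 0, 1, 2
        int[] order = buildOrder(valueByDoc);
        int[] sorted = sortHits(new int[] {0, 2}, order);
        System.out.println(Arrays.toString(order));
        System.out.println(Arrays.toString(sorted));
        System.out.println(orderArrayBytes(1_000_000_000L) + " bytes for 1 billion docs");
    }
}
```

The array has to be fully built before the first sorted search, which is
the up-front cost mentioned in the quoted message. After that, sorting a
result set only touches order[docId] for the documents that matched.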



-- 
Lance Norskog
goks...@gmail.com

