Re: Custom Sorting

Sandeep Khanzode Wed, 25 Jun 2014 04:35:31 -0700

Hi,

Thanks for your reply. 
Actually, I am evaluating both approaches.

With the sort being performed on a field indexed in Lucene itself, my concern 
is with the FieldCache. Very quickly, for multiple clients executing in 
parallel, it bumps up to 8-10GB. This is for 4-5 different Sort fields using an 
index corpus of 50M documents. The problem is not so much the memory 
consumption as controlling it. If the max heap argument for the JVM is 
scaled back to 2-3GB, then all clients throw an OOM. Can the FieldCache 
scale to the max heap available to the JVM, can it be selectively turned 
off, or can it use an LRU-type algorithm to purge old entries?
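
To make the LRU idea in the question concrete, here is a minimal sketch in 
plain Java. It is not a Lucene API and is not how the FieldCache works; the 
class name LruCache and the maxEntries parameter are hypothetical, chosen only 
for illustration. It shows a map that evicts its least recently used entry 
once a size limit is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: a size-bounded LRU map.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after each insertion; returning true evicts the
        // least recently accessed entry.
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread-safe, so with multiple clients executing in 
parallel the map would need external synchronization, for example by wrapping 
it with Collections.synchronizedMap.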

Secondly, on the DB approach: yes, it will not perform. However, I just wanted 
to know whether a custom sort function exists that lets one write their own 
sort on a field that is not indexed by Lucene.

Thanks again,

-----------------------
Thanks n Regards,
Sandeep Ramesh Khanzode

On Wednesday, June 25, 2014 1:21 AM, Erick Erickson <erickerick...@gmail.com> 
wrote:

I'm a little confused here. Sure, sorting on a number of fields will
increase memory; the basic idea here is that you need to cache all the
sort values (plus support structures) for performance reasons.
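
As a rough back-of-envelope for why caching every sort value costs memory, the 
sketch below multiplies documents by bytes per value by number of fields. It 
assumes each sort field is cached as one fixed-width 8-byte value per 
document, which is a simplification: string fields and the support structures 
mentioned above cost more. The class and method names are hypothetical; the 
50M documents and 5 fields come from the figures quoted in this thread.

```java
// Hypothetical helper for a lower-bound estimate; not a Lucene API.
class SortCacheEstimate {
    // Bytes needed to hold one fixed-width value per document per field.
    static long estimateBytes(long numDocs, int bytesPerValue, int numFields) {
        return numDocs * bytesPerValue * numFields;
    }

    public static void main(String[] args) {
        // 50M documents, 8 bytes per value (a long), 5 sort fields.
        long bytes = estimateBytes(50_000_000L, 8, 5);
        System.out.println(bytes + " bytes"); // prints "2000000000 bytes"
    }
}
```

Under these assumptions the values alone come to about 2 GB, before any 
per-entry overhead, so a 2-3GB heap leaves little headroom.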

If you create your own custom sort that goes out to a DB and gets the
doc, you have to be prepared for
q=*:*&sort=custom_function
Which means you'll have to fetch the value for each and every document
in the index. If this is a DB call, it will NOT perform.

In order to be performant, you'll need to cache the values. Which is
what is being done _for_ you by the FieldCache.
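
The point about caching can be illustrated with a small sketch in plain Java. 
It assumes the external sort values have already been loaded once into an 
array indexed by document ID (the array valueByDoc and the class name are 
hypothetical, and this is not Lucene's FieldComparator API). Each comparison 
is then an array lookup rather than a DB call.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration: sort document IDs by values held in memory.
class CachedSortSketch {
    // Returns the given doc IDs ordered ascending by their cached value.
    // valueByDoc[d] is assumed to hold the sort value for doc ID d.
    static Integer[] sortByCachedValue(int[] docIds, long[] valueByDoc) {
        Integer[] sorted = new Integer[docIds.length];
        for (int i = 0; i < docIds.length; i++) {
            sorted[i] = docIds[i];
        }
        // Each comparison reads from the in-memory array; no external call.
        // Arrays.sort on objects is stable, so ties keep their input order.
        Arrays.sort(sorted, Comparator.comparingLong((Integer d) -> valueByDoc[d]));
        return sorted;
    }
}
```

For example, doc IDs {0, 1, 2, 3} with cached values {30, 10, 20, 10} come 
back in the order 1, 3, 2, 0. Holding valueByDoc in memory is the same kind of 
cost the FieldCache pays, which is why moving the data to a DB does not avoid 
it.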

So I think this is really a false path, or an "XY" problem. Why do you
think you need to do this?

Best,
Erick

On Tue, Jun 24, 2014 at 10:31 AM, Sandeep Khanzode
<sandeep_khanz...@yahoo.com.invalid> wrote:
> Hi,
>
> I am trying to implement a sort order for search results in Lucene 4.7.2.
>
> If I want to use data for ordering that is not stored in Lucene as Fields, is 
> there any way this can be done?
> Basically, I would have certain data that is associated logically to a 
> document but stored elsewhere, like a DB. Can I create a Custom Sort function 
> on the lines of a FieldComparator to sort based on this data by plugging it 
> inside the sort function?
>
> I have tested the performance of the Sort function for String and numeric 
> types, and as mentioned in some blog, it seems that the numeric type is much 
> faster compared to the string type. However, if I sort on a number of fields 
> from multiple clients, the memory footprint, due to the FieldCache.DEFAULT 
> impl, increases approximately 5-6 times. If I run this on a machine which 
> does not have this capacity, will I get an OOM or will there be intense 
> memory thrashing?
>
>
> -----------------------
> Thanks n Regards,
> Sandeep Ramesh Khanzode

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
