Hi,

Thanks for your reply. Actually, I am evaluating both approaches.

With the sort being performed on a field indexed in Lucene itself, my concern is with the FieldCache. With multiple clients executing in parallel, it very quickly grows to 8-10GB. This is for 4-5 different sort fields on an index corpus of 50M documents. The problem is not so much the memory consumption as controlling it. If the max heap argument for the JVM is scaled back to 2-3GB, then all clients throw an OOM. Can the FieldCache scale based on the maximum memory available to the JVM, can it be selectively turned off, or can it use an LRU-type algorithm to purge old entries?

Secondly, with the DB approach, yes, it will not perform. However, I just wanted to know whether a custom sort function exists that allows one to write their own sort on a field that is not indexed by Lucene.

Thanks again,

-----------------------
Thanks n Regards,
Sandeep Ramesh Khanzode

On Wednesday, June 25, 2014 1:21 AM, Erick Erickson <erickerick...@gmail.com> wrote:

I'm a little confused here. Sure, sorting on a number of fields will increase memory; the basic idea here is that you need to cache all the sort values (plus support structures) for performance reasons.

If you create your own custom sort that goes out to a DB and gets the doc, you have to be prepared for

q=*:*&sort=custom_function

which means you'll have to fetch the value for each and every document in the index. If this is a DB call, it will NOT perform.

In order to be performant, you'll need to cache the values, which is what is being done _for_ you by the FieldCache.

So I think this is really a false path, or an "XY" problem. Why do you think you need to do this?

Best,
Erick

On Tue, Jun 24, 2014 at 10:31 AM, Sandeep Khanzode <sandeep_khanz...@yahoo.com.invalid> wrote:

> Hi,
>
> I am trying to implement a sort order for search results in Lucene 4.7.2.
>
> If I want to use data for ordering that is not stored in Lucene as Fields, is
> there any way this can be done?
> Basically, I would have certain data that is associated logically to a
> document but stored elsewhere, like a DB. Can I create a Custom Sort function
> on the lines of a FieldComparator to sort based on this data by plugging it
> inside the sort function?
>
> I have tested the performance of the Sort function for String and numeric
> types, and as mentioned in some blog, it seems that the numeric type is much
> faster compared to the string type. However, if I sort on a number of fields
> from multiple clients, the memory footprint, due to the FieldCache.DEFAULT
> impl, increases approximately 5-6 times. If I run this on a machine which
> does not have this capacity, will I get an OOM or will there be intense
> thrashing for the memory?
>
> -----------------------
> Thanks n Regards,
> Sandeep Ramesh Khanzode

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
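
For reference, the "LRU type of algorithm to purge old entries" asked about in the thread can be sketched in plain Java. This is only an illustration of the eviction idea, not Lucene's FieldCache and not a Lucene API: the class name LruSortValueCache, the field names, and the long[] value arrays are hypothetical, and the bound is on the number of entries rather than on bytes of heap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: a size-bounded map that evicts
// the least recently used entry once maxEntries is exceeded.
class LruSortValueCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruSortValueCache(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // removes the least recently used entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        // Assumed layout: one array of sort values per sort field.
        LruSortValueCache<String, long[]> cache = new LruSortValueCache<String, long[]>(2);
        cache.put("price", new long[] {3, 1, 2});
        cache.put("date", new long[] {9, 8, 7});
        cache.get("price");                       // touch "price", so "date" is now eldest
        cache.put("rank", new long[] {5, 5, 5});  // exceeds the bound, evicts "date"
        System.out.println(cache.keySet());       // prints [price, rank]
    }
}
```

LinkedHashMap is not thread-safe, so with multiple clients a map like this would need external synchronization, for example via Collections.synchronizedMap. An evicted entry also has to be rebuilt on its next use, which is the per-document fetch cost Erick describes above.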