Thanks for all your ideas. I was expecting the sorting-related fix in 3.0; hopefully it will get into 3.1, which would be great.
I am using Integer for datetime. As the data grows, I am hitting the upper memory limit. As my application is part of a product that is used in different environments, we cannot ask all customers to increase the memory. We need to run with less and find some other way out.

One option that most people in the group discussed is using Ehcache. Consider the data below being indexed, where unique_id is the id generated for every record:

unique_id, field1, field2, date_time

In Ehcache, suppose I store:

unique_id, date_time

How could I merge the results from Lucene and Ehcache? Do I need to fetch all the search results, compare them against the Ehcache results and decide (using FieldComparatorSource)? Could someone help me with how to go about it? Consider the data to be static: there will be updates and modifications, but unique_id and date_time are not going to change. I cannot use the docid from Lucene because it changes after updates.

(Future thought / research) One more thought: is there any way to write the index in sorted order, maybe while merging, by assigning docids according to the sort order of the selected field? This way we could achieve sorting with zero RAM utilization. In most applications the sorted field is fixed. I am just interested to know about these things.

Regards
Ganesh

----- Original Message -----
From: "Toke Eskildsen" <t...@statsbiblioteket.dk>
To: <java-user@lucene.apache.org>; "Toke Eskildsen" <t...@statsbiblioteket.dk>
Sent: Thursday, December 17, 2009 9:33 PM
Subject: RE: External sort

Sigh... Forest for the trees, Toke. The date time represented as an integer _is_ the order and at the same time a global external representation. No need to jump through all the hoops I made. I had my mind full of a more general solution to the memory problem with sort.

Instead of the order-array being a long[] with order and datetime, it should be just an int[] with datetime.
The FieldCacheImpl does this for INT sorts, so there's no need for extra code if you just store the datetime as an integer (something like Integer.toString(datetimeAsInt) for the field value) and use SortField(fieldname, SortField.INT) to sort with. If you cannot store the datetime as an integer (e.g. if you don't control the indexing), you can use the FieldCacheImpl with a custom int-parser that translates your datetime representation to int.

The internal representation takes 4 bytes/document. If you need to go lower than that, I'd say you have a very uncommon setup. It can be done by writing custom code and storing the order-array on disk, but access speed would suffer tremendously for searches with millions of hits. An alternative would be to reduce the granularity of the datetime and use SortField.SHORT or SortField.BYTE. A third alternative would be to count the number of unique datetime values and make a compressed representation, but that would make the creation of the order-array more complex.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
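
Editor's sketch: the memory trade-off in Toke's reply can be illustrated in plain Java, with no Lucene on the classpath. This is not Lucene's FieldCacheImpl, only a rough model of the idea: one int per document used directly as the sort key, and a coarser value that fits in a short. The class name, the method names, the epoch-seconds encoding and the one-day granularity are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Lucene.
public class DatetimeSortSketch {

    /** Assumed encoding: seconds since the epoch (fits in an int until 2038). */
    static int toIntSeconds(long epochMillis) {
        // toIntExact throws ArithmeticException instead of silently overflowing.
        return Math.toIntExact(epochMillis / 1000L);
    }

    /** Returns doc ids ordered by their int value; ties keep doc id order. */
    static int[] sortDocsByValue(int[] valueByDoc) {
        Integer[] docs = new Integer[valueByDoc.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        // Arrays.sort on object arrays is stable, so equal values stay in doc id order.
        Arrays.sort(docs, (a, b) -> Integer.compare(valueByDoc[a], valueByDoc[b]));
        int[] ordered = new int[docs.length];
        for (int i = 0; i < docs.length; i++) {
            ordered[i] = docs[i];
        }
        return ordered;
    }

    /** Coarser granularity (assumed: whole days since a base instant), 2 bytes per value. */
    static short toDayBucket(int epochSeconds, int baseEpochSeconds) {
        long days = Math.floorDiv((long) epochSeconds - baseEpochSeconds, 86_400L);
        if (days < Short.MIN_VALUE || days > Short.MAX_VALUE) {
            throw new IllegalArgumentException("day offset does not fit in a short: " + days);
        }
        return (short) days;
    }

    /** Payload size of the per-document array, ignoring the JVM array header. */
    static long orderArrayBytes(int docCount, int bytesPerDoc) {
        return (long) docCount * bytesPerDoc;
    }

    public static void main(String[] args) {
        int[] datetimeByDoc = {300, 100, 200, 100};
        // The int value itself is the sort order; no separate ordinal is needed.
        System.out.println(Arrays.toString(sortDocsByValue(datetimeByDoc))); // [1, 3, 2, 0]
        System.out.println(orderArrayBytes(10_000_000, Integer.BYTES));      // 40000000
        System.out.println(orderArrayBytes(10_000_000, Short.BYTES));        // 20000000
    }
}
```

With 10 million documents, the int array costs about 40 MB and the short array about 20 MB, at the price of day-level instead of second-level ordering. Inside Lucene the corresponding choice would be the SortField type mentioned in the reply rather than hand-written arrays.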