Hello Eric, I agree, the number of unique terms might be less, but [ 4 * reader.maxdoc() * different fields ] will increase the memory consumption. I am having 100 million records spread across 10 DB. 4 * 100M is itself 400 MB. If i try to use 2 fields for sorting then it would be 800 MB. The unique terms may be less but this array itself consume huge memory.
I am not able to understand you approach of splitting date time would reduce memory consumption. Regards Ganesh ----- Original Message ----- From: "Erick Erickson" <erickerick...@gmail.com> To: <java-user@lucene.apache.org> Sent: Thursday, July 23, 2009 1:55 AM Subject: Re: Alternative way to simulate sorting without doing actual sort >I was assuming you were storing things as strings, in which case > it works something like this: > Let's say you broke it up into > YYYY > MM > DD > HH > MM > > The number of unique terms that need to be kept in > memory to sort is just (let's say your documents > span 100 years) > 100 + 12 + 31 + 24 + 60. > > But that's a much different case. > > On Wed, Jul 22, 2009 at 5:31 AM, Ganesh <emailg...@yahoo.co.in> wrote: > >> Hello Eric, >> >> Thanks for your reply. >> >> Memory reqd for sorting: 4 * reader.maxdoc() >> . >> I am sorting datetime with minute resolution. 100 records are representing >> a minute then in a 1 million record database, there will be around 20000 >> unique terms. the amount of memory consumed would be 4 * 1000000 + 20000 * 8 >> [Considering date time as Long] >> >> The more amount of memory consumed by 4 * reader.maxdoc. If i have two or >> three fields say (YYYMMDD, hh, mm) then the amount of memory consumption >> would be too high. How could you say that splitting the field will help in >> reducing the memory usage. >> >> Please correct me if i am wrong. I require some justification to split the >> date in to multiple terms. >> >> Regards >> Ganesh >> >> >> ----- Original Message ----- >> From: "Erick Erickson" <erickerick...@gmail.com> >> To: <java-user@lucene.apache.org> >> Sent: Tuesday, July 21, 2009 7:29 PM >> Subject: Re: Alternative way to simulate sorting without doing actual sort >> >> >> > Have you tried splitting your times into separate fields, perhaps one >> with >> > YYYYMMDD and another with HHMM, then do a primary sort on the YYYMMDD and >> > secondary on HHMM. That'll reduce your total unique values greatly and >> > should improve your memory consumption. >> > Best >> > Erick >> > >> > On Tue, Jul 21, 2009 at 4:27 AM, Ganesh <emailg...@yahoo.co.in> wrote: >> > >> >> Hello all >> >> >> >> I am sorting on datetime with minute resolution. It easily reaches the >> >> maximum heap size. I am having almost 100M records and it is using 1.5 >> GB. I >> >> am now in a situitation to stop sorting and to find some other >> alternative >> >> way. >> >> >> >> I tried adding document boost and field boost for date time. document >> boost >> >> alone is not working. document boost and field boost has impact on >> score. >> >> Search on datetime gives me the sorted datetime results but search on >> any >> >> other field didn't works. >> >> >> >> I am doing updates and it changes the doc id.. I want to get the results >> >> sorted by FIRST TIME inserted order. Updates should not disturb the >> results >> >> set. I think Solr has some facilities to get the list of recently added >> >> documents. >> >> >> >> Any ideas are greatly appreciated. >> >> >> >> Regards >> >> GaneshSend instant messages to your online friends >> >> http://in.messenger.yahoo.com >> >> >> >> --------------------------------------------------------------------- >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> > >> Send instant messages to your online friends http://in.messenger.yahoo.com >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > Send instant messages to your online friends http://in.messenger.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org