Re: Question regarding sorting and memory consumption in lucene

2008-10-15 Thread mark harwood
Further to our discussion - see below a class that measures the added construction cost and memory savings for an optimised field value cache for a given index. The optimisation here being initial use of byte arrays, then shorts, then ints as more unique terms emerge. I imagine the majority of
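The byte, then short, then int progression described in this message can be sketched roughly as follows. This is not the class attached to the original post (the snippet is cut off before it); `PackedOrds` and its methods are hypothetical names, and the sketch assumes each document stores one non-negative term ordinal.

```java
/** Hypothetical sketch: per-document term ordinals kept in the narrowest array that fits. */
final class PackedOrds {
    private byte[] bytes;   // in use while every ordinal fits in 0..255
    private short[] shorts; // in use while every ordinal fits in 0..65535
    private int[] ints;     // fallback once more than 65536 unique terms are seen

    PackedOrds(int maxDoc) {
        bytes = new byte[maxDoc];
    }

    void set(int doc, int ord) {
        if (ord < 0) throw new IllegalArgumentException("ordinal must be non-negative");
        // Widen the backing array only when a larger ordinal actually turns up.
        if (bytes != null && ord > 0xFF) promoteToShorts();
        if (shorts != null && ord > 0xFFFF) promoteToInts();
        if (bytes != null) bytes[doc] = (byte) ord;
        else if (shorts != null) shorts[doc] = (short) ord;
        else ints[doc] = ord;
    }

    int get(int doc) {
        // Mask so that stored values are read back as unsigned.
        if (bytes != null) return bytes[doc] & 0xFF;
        if (shorts != null) return shorts[doc] & 0xFFFF;
        return ints[doc];
    }

    /** Current storage cost per document, in bytes. */
    int bytesPerDoc() {
        return bytes != null ? 1 : shorts != null ? 2 : 4;
    }

    private void promoteToShorts() {
        shorts = new short[bytes.length];
        for (int i = 0; i < bytes.length; i++) shorts[i] = (short) (bytes[i] & 0xFF);
        bytes = null;
    }

    private void promoteToInts() {
        ints = new int[shorts.length];
        for (int i = 0; i < shorts.length; i++) ints[i] = shorts[i] & 0xFFFF;
        shorts = null;
    }
}
```

The added construction cost mentioned above shows up here as the copy made on each promotion; the memory saving is whatever `bytesPerDoc()` ends up below 4.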

Re: Question regarding sorting and memory consumption in lucene

2008-10-14 Thread Mark Harwood
Yes, StringIndex's public fields make life awkward. Re initialization - I did think you could try using arrays of byte arrays. The first 256 terms can be addressed using just one byte array; on encountering a 257th term, an extra byte array is allocated. References to terms then require indexing into
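One possible reading of the "arrays of byte arrays" idea, given that the message is cut off, is sketched below: each extra byte array holds the next 8 bits of every document's ordinal, so growing past 256 terms allocates a new zero-filled array without copying the existing one. `BytePlaneOrds` and its methods are hypothetical and are not taken from Lucene or from the original post.

```java
import java.util.Arrays;

/** Hypothetical sketch: ordinals split across byte arrays, one array per 8 bits. */
final class BytePlaneOrds {
    private final int maxDoc;
    private byte[][] planes; // planes[p][doc] holds bits 8p..8p+7 of the ordinal

    BytePlaneOrds(int maxDoc) {
        this.maxDoc = maxDoc;
        this.planes = new byte[][] { new byte[maxDoc] };
    }

    void set(int doc, int ord) {
        if (ord < 0) throw new IllegalArgumentException("ordinal must be non-negative");
        // Count how many bytes this ordinal needs (1 to 4).
        int needed = 1;
        while (needed < 4 && (ord >>> (8 * needed)) != 0) needed++;
        if (needed > planes.length) {
            int old = planes.length;
            planes = Arrays.copyOf(planes, needed);
            // New planes start zero-filled, so earlier ordinals keep their values.
            for (int p = old; p < needed; p++) planes[p] = new byte[maxDoc];
        }
        for (int p = 0; p < planes.length; p++) planes[p][doc] = (byte) (ord >>> (8 * p));
    }

    int get(int doc) {
        int ord = 0;
        for (int p = 0; p < planes.length; p++) ord |= (planes[p][doc] & 0xFF) << (8 * p);
        return ord;
    }

    int planeCount() {
        return planes.length;
    }
}
```

The trade-off in this layout is that each lookup touches one array per plane, in exchange for never rewriting values that were already stored.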

Re: Question regarding sorting and memory consumption in lucene

2008-10-14 Thread Chris Hostetter
: Actually looking at this a little deeper maybe Lucene could/should : automatically be doing this "short" optimisation here? At the moment it can't, the arrays in StringIndex are public. The other thing that would be a bit tricky is the initialization ... I can't think of any easy way to kn

Re: Question regarding sorting and memory consumption in lucene

2008-10-13 Thread Ganesh
rds Ganesh - Original Message - From: "mark harwood" <[EMAIL PROTECTED]> To: Sent: Friday, October 10, 2008 6:48 PM Subject: Re: Question regarding sorting and memory consumption in lucene Assuming content is added in chronological order and with no updates to existing

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread mark harwood
Stensby <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 10 October, 2008 16:45:02 Subject: Re: Question regarding sorting and memory consumption in lucene That's a really good idea Mark! :) Thanks! Will try to see if I can make a quick change with your suggestion. (Too bad qu

RE: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread Robert Stewart
: Aleksander M. Stensby [mailto:[EMAIL PROTECTED] Sent: Friday, October 10, 2008 11:45 AM To: java-user@lucene.apache.org Subject: Re: Question regarding sorting and memory consumption in lucene That's a really good idea Mark! :) Thanks! Will try to see if I can make a quick change with your sugge

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread Aleksander M. Stensby
: java-user@lucene.apache.org Sent: Friday, 10 October, 2008 15:25:29 Subject: Re: Question regarding sorting and memory consumption in lucene Unfortunately no, since the documents that are added may come from a new "source" containing old documents as well. :/ I tried deploying our webapp

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread mark harwood
epresent up to 65536 values - capable of representing a date range of 179 years. - Original Message From: mark harwood <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 10 October, 2008 15:43:35 Subject: Re: Question regarding sorting and memory consumpt
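The 179-year figure follows from 65536 / 365 ≈ 179.5 when each value stands for one day. A minimal sketch of such an encoding, assuming day resolution and an arbitrary base date (both `ShortDayCodec` and the choice of 1970-01-01 are illustrative assumptions, not part of the thread):

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Hypothetical sketch: a date stored as an unsigned 16-bit day offset from a base date. */
final class ShortDayCodec {
    // Assumed base date; any fixed date at or before the oldest document would do.
    static final LocalDate BASE = LocalDate.of(1970, 1, 1);

    static short encode(LocalDate date) {
        long days = ChronoUnit.DAYS.between(BASE, date);
        if (days < 0 || days > 0xFFFF) {
            throw new IllegalArgumentException("date outside the 16-bit day range: " + date);
        }
        return (short) days; // offsets above 32767 wrap to negative; decode() masks them back
    }

    static LocalDate decode(short value) {
        return BASE.plusDays(value & 0xFFFF);
    }
}
```

Because Java's `short` is signed, the mask in `decode` is what makes the full 0..65535 range usable.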

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread Aleksander M. Stensby
hat would require no memory cache at all when sorting. Querying across multiple indexes simultaneously however may present an added complication... - Original Message From: Aleksander M. Stensby <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 10 October, 2008

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread mark harwood
> > To: java-user@lucene.apache.org > Sent: Friday, 10 October, 2008 13:51:50 > Subject: Re: Question regarding sorting and memory consumption in lucene > > I'll follow up on my own question... > Let's say that we have 4 years of data, meaning that there will be >

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread Aleksander M. Stensby
ing. Querying across multiple indexes simultaneously however may present an added complication... - Original Message From: Aleksander M. Stensby <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 10 October, 2008 13:51:50 Subject: Re: Question regarding sortin

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread mark harwood
esent an added complication... - Original Message From: Aleksander M. Stensby <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 10 October, 2008 13:51:50 Subject: Re: Question regarding sorting and memory consumption in lucene I'll follow up on my own quest

Re: Question regarding sorting and memory consumption in lucene

2008-10-10 Thread Aleksander M. Stensby
I'll follow up on my own question... Let's say that we have 4 years of data, meaning that there will be roughly 4 * 365 = 1460 unique terms for our sort field. For one index, let's say with 30 million docs, the cache should use approx. 100 MB, or am I wrong? And thus for 6 indexes we would need a
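The estimate above can be checked with some quick arithmetic. The sketch below assumes the cache stores one ordinal per document and ignores the much smaller table of 1460 term strings; `SortCacheEstimate` is a hypothetical helper, not a Lucene class. With 4-byte int ordinals, 30 million docs come to 120,000,000 bytes (about 114 MiB), in line with the "approx. 100 MB" guess. Since 1460 unique terms fit in 16 bits, 2-byte ordinals would halve that.

```java
/** Hypothetical sketch: rough size of a per-document ordinal array for a sort field. */
final class SortCacheEstimate {
    /** Narrowest of 1, 2 or 4 bytes that can address the given number of unique terms. */
    static int bytesPerOrd(int uniqueTerms) {
        if (uniqueTerms <= 1 << 8) return 1;
        if (uniqueTerms <= 1 << 16) return 2;
        return 4;
    }

    /** Bytes needed when every document stores an ordinal of the given width. */
    static long ordArrayBytes(long docs, int bytesPerOrd) {
        return docs * bytesPerOrd;
    }

    public static void main(String[] args) {
        long docs = 30_000_000L;
        int terms = 4 * 365; // 1460 unique day values over 4 years
        System.out.println("int ordinals:    " + ordArrayBytes(docs, 4) + " bytes");
        System.out.println("packed ordinals: " + ordArrayBytes(docs, bytesPerOrd(terms)) + " bytes");
    }
}
```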