Further to our discussion - see below a class that measures the added
construction cost and memory savings for an optimised field value cache for a
given index.
The optimisation here is to start with byte arrays, then switch to shorts,
then ints as more unique terms emerge.
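The class referred to above is not included in this excerpt. As a rough illustration of the byte, then short, then int idea, here is a minimal standalone sketch. The class name AdaptiveOrdinals and its methods are hypothetical, it does not touch any Lucene API, and it assumes ordinals are non-negative ints assigned per document.

```java
/**
 * Hypothetical sketch (not the class from the original mail): stores one
 * term ordinal per document in the narrowest primitive array that fits,
 * widening from byte to short to int as larger ordinals appear.
 */
final class AdaptiveOrdinals {
    private byte[] bytes;    // in use while every ordinal fits in 0..255
    private short[] shorts;  // in use while every ordinal fits in 0..65535
    private int[] ints;      // final fallback

    AdaptiveOrdinals(int maxDoc) {
        bytes = new byte[maxDoc];
    }

    void set(int doc, int ord) {
        if (ord < 0) {
            throw new IllegalArgumentException("ordinal must be >= 0");
        }
        if (bytes != null && ord > 0xFF) {
            widenToShorts();
        }
        if (shorts != null && ord > 0xFFFF) {
            widenToInts();
        }
        if (bytes != null) {
            bytes[doc] = (byte) ord;
        } else if (shorts != null) {
            shorts[doc] = (short) ord;
        } else {
            ints[doc] = ord;
        }
    }

    int get(int doc) {
        if (bytes != null) {
            return bytes[doc] & 0xFF;      // read back as unsigned
        }
        if (shorts != null) {
            return shorts[doc] & 0xFFFF;   // read back as unsigned
        }
        return ints[doc];
    }

    /** Current storage width per document, in bytes. */
    int bytesPerDoc() {
        return bytes != null ? 1 : (shorts != null ? 2 : 4);
    }

    // Widening copies every existing value; this is the added construction cost.
    private void widenToShorts() {
        shorts = new short[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            shorts[i] = (short) (bytes[i] & 0xFF);
        }
        bytes = null;
    }

    private void widenToInts() {
        ints = new int[shorts.length];
        for (int i = 0; i < shorts.length; i++) {
            ints[i] = shorts[i] & 0xFFFF;
        }
        shorts = null;
    }
}
```

Each widening step copies the whole array once, so in this sketch the construction overhead is at most two extra passes over the documents.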
I imagine the majority of
Yes, StringIndex's public fields make life awkward. Re initialization: I did
think you could try using arrays of byte arrays. The first 256 terms can be
addressed using just one byte array; on encountering a 257th term, an extra byte array is
allocated. References to terms then require indexing into
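The rest of this suggestion is cut off, so the following is only one possible reading of it: keep the low 8 bits of every ordinal in a first byte array and allocate a further byte array for the next 8 bits only once an ordinal no longer fits. The class name BytePlaneOrdinals and the "plane" terminology are hypothetical, and nothing here is a Lucene API.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch of "arrays of byte arrays": plane 0 holds bits 0..7
 * of each document's ordinal, plane 1 holds bits 8..15, and so on.
 * A new plane is allocated only when an ordinal needs it.
 */
final class BytePlaneOrdinals {
    private final int maxDoc;
    private byte[][] planes;

    BytePlaneOrdinals(int maxDoc) {
        this.maxDoc = maxDoc;
        this.planes = new byte[][] { new byte[maxDoc] };
    }

    void set(int doc, int ord) {
        if (ord < 0) {
            throw new IllegalArgumentException("ordinal must be >= 0");
        }
        // Add planes until the ordinal fits (at most 4 planes for an int).
        while (planes.length < 4 && (ord >>> (8 * planes.length)) != 0) {
            byte[][] grown = Arrays.copyOf(planes, planes.length + 1);
            grown[planes.length] = new byte[maxDoc]; // zero-filled high bits
            planes = grown;
        }
        for (int p = 0; p < planes.length; p++) {
            planes[p][doc] = (byte) (ord >>> (8 * p));
        }
    }

    int get(int doc) {
        int ord = 0;
        for (int p = 0; p < planes.length; p++) {
            ord |= (planes[p][doc] & 0xFF) << (8 * p);
        }
        return ord;
    }

    int planeCount() {
        return planes.length;
    }
}
```

Under this reading, existing values never need copying when a plane is added, because a fresh plane is zero-filled and so leaves earlier ordinals unchanged. The price is one array lookup per plane on every read.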
: Actually looking at this a little deeper maybe Lucene could/should
: automatically be doing this "short" optimisation here?
At the moment it can't, the arrays in StringIndex are public.
The other thing that would be a bit tricky is the initialization ... I
can't think of any easy way to kn
rds
Ganesh
- Original Message -
From: "mark harwood" <[EMAIL PROTECTED]>
To:
Sent: Friday, October 10, 2008 6:48 PM
Subject: Re: Question regarding sorting and memory consumption in lucene
Assuming content is added in chronological order and with no updates to
existing
Stensby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 10 October, 2008 16:45:02
Subject: Re: Question regarding sorting and memory consumption in lucene
That's a really good idea Mark! :)
Thanks! Will try to see if I can make a quick change with your suggestion.
(Too bad qu
: Aleksander M. Stensby [mailto:[EMAIL PROTECTED]
Sent: Friday, October 10, 2008 11:45 AM
To: java-user@lucene.apache.org
Subject: Re: Question regarding sorting and memory consumption in lucene
That's a really good idea Mark! :)
Thanks! Will try to see if I can make a quick change with your sugge
: java-user@lucene.apache.org
Sent: Friday, 10 October, 2008 15:25:29
Subject: Re: Question regarding sorting and memory consumption in lucene
Unfortunately no, since the documents that are added may come from a new
"source" containing old documents as well.. :/
I tried deploying our webapp
epresent up to 65536 values - capable of representing a date
range of 179 years.
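The 179-year figure follows from 65536 / 365.25, which is about 179.4, if each value stands for one day. A minimal sketch of such an encoding is below; the class name DayOrdinal and the base date of 1970-01-01 are assumptions chosen for illustration and do not come from the thread.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/**
 * Hypothetical sketch: encode a date with day resolution as an unsigned
 * 16-bit value counted from an assumed base date.
 */
final class DayOrdinal {
    // Assumption for illustration only; any base date would do.
    static final LocalDate BASE = LocalDate.of(1970, 1, 1);

    static short encode(LocalDate date) {
        long days = ChronoUnit.DAYS.between(BASE, date);
        if (days < 0 || days > 0xFFFF) {
            throw new IllegalArgumentException("date outside 16-bit day range: " + date);
        }
        return (short) days; // stored bit pattern, read back as unsigned
    }

    static LocalDate decode(short stored) {
        return BASE.plusDays(stored & 0xFFFF);
    }

    public static void main(String[] args) {
        LocalDate sent = LocalDate.of(2008, 10, 10);
        System.out.println(decode(encode(sent)));
        // Whole years covered by 65536 day values.
        System.out.println((int) (65536 / 365.25));
    }
}
```

Java's short is signed, so values above 32767 are stored as negative numbers and have to be masked with 0xFFFF when read back, as decode does.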
- Original Message
From: mark harwood <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 10 October, 2008 15:43:35
Subject: Re: Question regarding sorting and memory consumpt
hat would require no memory cache at all when sorting.
Querying across multiple indexes simultaneously however may present an
added complication...
- Original Message
From: Aleksander M. Stensby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 10 October, 2008
> To: java-user@lucene.apache.org
> Sent: Friday, 10 October, 2008 13:51:50
> Subject: Re: Question regarding sorting and memory consumption in lucene
>
> I'll follow up on my own question...
> Let's say that we have 4 years of data, meaning that there will be
>
- Original Message
From: Aleksander M. Stensby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 10 October, 2008 13:51:50
Subject: Re: Question regarding sorting and memory consumption in lucene
I'll follow up on my own question...
Let's say that we have 4 years of data, meaning that there will be roughly
4 * 365 = 1460 unique terms for our sort field.
For one index, let's say with 30 million docs, the cache should use approx.
100 MB, or am I wrong? And thus for 6 indexes we would need a
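As a back-of-the-envelope check of that estimate, the sketch below assumes the cache holds one 4-byte int ordinal per document and ignores the term strings themselves. 30 million documents then come to 120,000,000 bytes, roughly 114 MiB. The class and method names are hypothetical.

```java
/**
 * Hypothetical sketch: rough size of the per-document ordinal array of a
 * sort cache, ignoring the (comparatively small) array of term strings.
 */
final class SortCacheEstimate {

    /** Bytes needed for one ordinal per document at the given width. */
    static long ordinalBytes(long docCount, int bytesPerOrdinal) {
        return docCount * bytesPerOrdinal;
    }

    /** Narrowest of 1, 2 or 4 bytes that can number this many unique terms. */
    static int minBytesPerOrdinal(int uniqueTerms) {
        if (uniqueTerms <= (1 << 8)) {
            return 1;
        }
        if (uniqueTerms <= (1 << 16)) {
            return 2;
        }
        return 4;
    }

    public static void main(String[] args) {
        long docs = 30_000_000L;   // figure from the mail above
        int terms = 4 * 365;       // 1460 unique day terms

        System.out.println("int ordinals:    " + ordinalBytes(docs, 4) + " bytes");
        System.out.println("narrow ordinals: "
                + ordinalBytes(docs, minBytesPerOrdinal(terms)) + " bytes");
    }
}
```

With 1460 unique terms, 2-byte ordinals are enough, which would halve the per-index figure under these assumptions.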