From: Grant Ingersoll [mailto:[EMAIL PROTECTED]]
Sent: Monday, February 11, 2008 7:46 AM
To: java-user@lucene.apache.org
Subject: Re: large term vectors
Hi Marc,
Can you give more info about what your field properties are? Your
subject line implies you are storing term vectors, is that the case?
Also, what version of Lucene are you using?
Cheers,
Grant
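For reference, term-vector storage is chosen per field at index time; a
minimal sketch, assuming the Lucene 2.3-era API current at the time of
this thread (the field name and text are hypothetical):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    String bodyText = "...";          // hypothetical document text
    Document doc = new Document();
    // Field.TermVector.NO stores no vectors for this field;
    // WITH_POSITIONS_OFFSETS is the largest variant and grows the index.
    doc.add(new Field("body", bodyText,
            Field.Store.NO, Field.Index.TOKENIZED,
            Field.TermVector.NO));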
...is there some way to optimize around this?
Marc
-Original Message-
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]]
Sent: Monday, February 11, 2008 7:46 AM
To: java-user@lucene.apache.org
Subject: Re: large term vectors
Hi Marc,
Can you give more info about what your field properties are? [...]
It seems to have something to do with the
norms (SegmentReader.norms).
Marc
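If the resident arrays are indeed norms: Lucene of that era lazily loads
one byte per document per indexed field the first time a field's norms
are read, and caches them in the SegmentReader. A minimal sketch of
omitting norms at index time, assuming the 2.3-era API (the path, field
name, and text are hypothetical):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    IndexWriter writer = new IndexWriter("/path/to/index",
            new StandardAnalyzer(), true);   // true = create new index
    Document doc = new Document();
    Field body = new Field("body", "...",
            Field.Store.NO, Field.Index.TOKENIZED);
    body.setOmitNorms(true);                 // skip the per-doc norms byte
    doc.add(body);
    writer.addDocument(doc);
    writer.close();

The trade-off is that the field loses length normalization and
index-time boosts, and a merged segment keeps norms for a field if any
document in it still carries them.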
-Original Message-
From: Cedric Ho [mailto:[EMAIL PROTECTED]]
Sent: Sunday, February 10, 2008 9:19 PM
To: java-user@lucene.apache.org
Subject: Re: large term vectors
Is it a single index? My index is also in the 200G range [...]
On Feb 8, 2008, at 10:51 AM, <[EMAIL PROTECTED]> wrote:
Hi,
I guess it would be quite different for different apps.
For me, I do index updates on a single machine: index each incoming
document into one chunk according to some rule to ensure even
distribution. Then copy all the updated indexes to some other machines
for searching. Each machine will then reopen its searchers.
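A minimal sketch of such a routing rule, hashing a stable document key
to pick a chunk (the key and the array of chunk writers are
hypothetical):

    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;

    // writers[i] is the already-open IndexWriter for chunk i
    static void route(IndexWriter[] writers, String docKey, Document doc)
            throws IOException {
        // mask the sign bit so the modulo is never negative
        int chunk = (docKey.hashCode() & 0x7fffffff) % writers.length;
        writers[chunk].addDocument(doc);
    }

Hashing a stable key (rather than round-robin) keeps a document in the
same chunk across updates, so a delete-and-re-add finds the old copy.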
So, I have a question about 'splitting indexes'. I see people mention
this all over, but how have people been handling it? I'm going to
start a new thread (there probably was one back in the day, but I'll
fire it up again). How did you do it?
On Feb 10, 2008 9:18 PM, Cedric Ho <[EMAIL PROTECTED]> wrote:
Is it a single index ? My index is also in the 200G range, but I never
managed to get
a single index of size > 20G and still get acceptable performance (in
both searching and updating).
So I split my indexes into chunks of < 10G each.
I am curious how you manage such a large single index.
Cedric
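For searching across chunks like these, the 2.3-era MultiSearcher
presents several sub-indexes as one; a minimal sketch with hypothetical
paths:

    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MultiSearcher;
    import org.apache.lucene.search.Searchable;

    Searchable[] chunks = {
        new IndexSearcher("/indexes/chunk0"),
        new IndexSearcher("/indexes/chunk1"),
        new IndexSearcher("/indexes/chunk2")
    };
    MultiSearcher searcher = new MultiSearcher(chunks);
    // searcher.search(...) now spans all three chunks

ParallelMultiSearcher has the same interface but runs the per-chunk
searches in parallel threads.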
Hi,
I have a large index which is around 275GB. As I search different parts
of the index, the memory footprint grows with large byte arrays being
stored. They never seem to get unloaded or GC'ed. Is there any way to
control this behavior so that I can periodically unload cached
information?
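If the arrays are the cached norms, they are held inside the
IndexReader's SegmentReaders and released only when the reader is
closed, so one workaround is to periodically swap in a fresh reader. A
sketch, assuming the 2.3-era API and a hypothetical path:

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;

    IndexReader reader = IndexReader.open("/path/to/index");
    IndexSearcher searcher = new IndexSearcher(reader);
    // ... serve queries for a while ...
    searcher.close();
    reader.close();      // frees cached norms and other per-segment state
    reader = IndexReader.open("/path/to/index");
    searcher = new IndexSearcher(reader);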