Just after the indexing process is complete, when I try to run a simple
query, the application hits an OutOfMemoryError: Java heap space.
The InfoReader log reports 'hit exception during NRT Reader':
<http://lucene.472066.n3.nabble.com/file/n4345589/exception_during_nrt_reader.png>
Also ... exceeded by so much, though.
Average document size is much smaller, definitely below 100K. Handling large
documents is relatively atypical, but when we get them there are a
relatively large number of them to be processed together.
Ours is a legal context where you need to be able to see, and eventually look at, all
of the documents matching a query (even if they are 100+ MB).
Thanks Erick!
On Wed, Nov 26, 2014 at 2:09 PM, Erick Erickson wrote:
> Well
> 2> seriously consider the utility of indexing a 100+M file. Assuming
> it's mostly text, lots and lots and lots of queries will match it, and
> it'll score pretty low due to length normalization. And you probably
> can't return it to
the above strategy would be reasonable, or do you need to process
large numbers of large documents?
-- Jack Krupansky
-Original Message-
From: ryanb
Sent: Tuesday, November 25, 2014 7:39 PM
To: java-user@lucene.apache.org
Subject: OutOfMemoryError indexing large documents
Hello,
We use vanilla Lucene 4.9.0 in a 64 bit Linux OS. We sometimes need to index
large documents (100+ MB), but this results in extremely high memory usage,
to the point of OutOfMemoryError even with 17GB of heap. We allow up to 20
documents to be indexed simultaneously, but the text to be analyzed and
indexed is streamed, not loaded into memory.
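For reference, a minimal sketch of the streaming setup described above, assuming the Lucene 4.x field API; the class name, method, and file path are illustrative and not the poster's actual code:

import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;

// Sketch only: hand the analyzer a Reader so the large body is streamed
// through the indexing chain instead of being held in memory as one String.
public class StreamingIndexer {
    public static void addLargeDocument(IndexWriter writer, Path file) throws IOException {
        Reader body = Files.newBufferedReader(file, StandardCharsets.UTF_8);
        Document doc = new Document();
        doc.add(new TextField("content", body)); // tokenized, not stored; the Reader is consumed during addDocument
        writer.addDocument(doc);
    }
}

Even with a streamed Reader, each in-flight document still buffers its inverted terms in the IndexWriter RAM buffer, so 20 large documents indexed simultaneously multiply that cost.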
Subject: Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer
Norms are not stored sparsely by the default codec.
So they take 1 byte per doc per indexed field regardless of whether
that doc had that field.
There is no setting to turn this off in IndexReader, though you could make
your fields omit norms at index time (and re-index), or increase the heap given to the JVM.
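A minimal sketch of the index-time alternative (omitting norms on fields that do not need length normalization, which requires re-indexing), assuming the Lucene 4.x field API; the class and field names are made up for illustration:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;

// Sketch only: with omitNorms set, the 1-byte-per-doc-per-field norms are
// never written for these fields, so they never have to be loaded at search time.
public class NoNormsFields {
    public static final FieldType TEXT_NO_NORMS = new FieldType(TextField.TYPE_NOT_STORED);
    static {
        TEXT_NO_NORMS.setOmitNorms(true);
        TEXT_NO_NORMS.freeze();
    }

    public static Document build(String title, String body) {
        Document doc = new Document();
        doc.add(new Field("title", title, TEXT_NO_NORMS));
        doc.add(new Field("body", body, TEXT_NO_NORMS));
        return doc;
    }
}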
We have 8 million documents and our jvm heap is 5G.
> Thanks & Best Regards!
> -- Original --
> From: "Michael McCandless"
> Date: Sat, Sep 13, 2014 06:29 PM
> To: "Lucene Users"
Subject: Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer
Hi, Mike
In our use case, we have thousands of index fields, and different kinds of
documents have different fields. Do you mean that norms fields will consume
a lot of memory? Why?
If we decide to disable norms ...
We have 8 million documents and our JVM heap is 5G.
Thanks & Best Regards!
-- Original --
From: "Michael McCandless"
Date: Sat, Sep 13, 2014 06:29 PM
To: "Lucene Users"
Subject: Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer
The worst case for norms is 1 byte per document per indexed field (regardless
of whether that doc had indexed that field); the alternatives are to disable
norms on your fields and re-index, or increase the heap given to the JVM.
Mike McCandless
http://blog.mikemccandless.com
On Sat, Sep 13, 2014 at 4:25 AM, 308181687 <308181...@qq.com> wrote:
Hi, all
we got an OutOfMemoryError thrown by SimpleMergedSegmentWarmer. We use
Lucene 4.7, and access the index files via NRTCachingDirectory/MMapDirectory.
Could anybody give me a hand? The stack trace is as follows:
org.apache.lucene.index.MergePolicy$MergeException
When you open this index for searching, how much heap do you give it?
In general, you should give IndexWriter the same heap size, since
during merge it will need to open N readers at once, and if you have
RAM resident doc values fields, those need enough heap space.
Also, the default DocValuesFormat ...
With forceMerge(1) throwing an OOM error, we switched to
forceMergeDeletes() which worked for a while, but that is now also
running out of memory. As a result, I've turned all manner of forced
merges off.
I'm more than a little apprehensive that if the OOM error can happen as
part of a forced merge ...
From: Michael van Rooyen [mailto:mich...@loot.co.za]
Sent: Thursday, September 26, 2013 12:26 PM
To: java-user@lucene.apache.org
Cc: Ian Lea
Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError
Yes, it happens as part of the early morning optimize, and yes, it's a
forceMerge(1) which I've disabled for now.
I haven't looked at the persistence mechanism for Lucene since 2.x, but
if I remember correctly, the deleted documents would stay in an index
segment until that segment was eventually merged.
Is this OOM happening as part of your early morning optimize or at
some other point? By optimize do you mean IndexWriter.forceMerge(1)?
You really shouldn't have to use that. If the index grows forever
without it then something else is going on which you might wish to
report separately.
--
Ian.
We've recently upgraded to Lucene 4.4.0 and mergeSegments now causes an
OOM error.
As background, our index contains about 14 million documents (growing
slowly) and we process about 1 million updates per day. It's about 8GB
on disk. I'm not sure if the Lucene segments merge the way they used to ...
Hi!
I'm trying to make an index of several text documents.
Their content is just field tab-separated strings:
word<\t>w1<\t>w2<\t>...<\t>wn
pos<\t>pos1<\t>pos2_a:pos2_b:pos2_c<\t>...<\t>posn_a:posn_b
...
There are 5 documents with a total size of 10 MB.
While indexing, Java uses about 2 GB of memory ...
You should set your RAMBufferSizeMB to something smaller than the full
heap size of your JVM.
Mike McCandless
http://blog.mikemccandless.com
On Sat, Jan 26, 2013 at 11:39 PM, wgggfiy wrote:
> I found it is very easy to come into OutOfMemoryError.
> My idea is that lucene could set t
I found it is very easy to run into OutOfMemoryError.
My idea is that Lucene could set the RAM buffer size automatically,
but I couldn't find the API. My code:
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
int mb = 1024 * 1024;
double ram = Runtime.getRuntime().maxMemory() / mb;
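A sketch of what Mike's suggestion above could look like: bound the RAM buffer explicitly instead of deriving it from the whole heap. The quarter-of-heap cap and 16 MB floor are arbitrary illustrative choices, not Lucene defaults:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

// Sketch only: keep the indexing RAM buffer well below the JVM heap.
public class BufferSizing {
    public static IndexWriterConfig newConfig() {
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
        double heapMb = Runtime.getRuntime().maxMemory() / (1024.0 * 1024.0);
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        iwc.setRAMBufferSizeMB(Math.max(16.0, heapMb / 4.0)); // arbitrary cap, leaves heap for everything else
        return iwc;
    }
}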
OK, found it:
we are using Cloudera CDHu3u; they change the ulimit for child jobs,
but I still don't know how to change their default settings yet.
On Wed, Jun 13, 2012 at 2:15 PM, Yang wrote:
> I got the OutOfMemoryError when I tried to open a Lucene index.
>
> it's very
Tamara
- Original Message -
> From: "Otis Gospodnetic"
> To: java-user@lucene.apache.org
> Sent: Tuesday, October 18, 2011 11:14:12 PM
> Subject: Re: OutOfMemoryError
>
> Bok Tamara,
>
> You didn't say what -Xmx value you are using. Try a little higher
Hi,
> ...I get around 3
> million hits. Each of the hits is processed and information from a certain
> field is
> used.
That's of course fine, but:
> After a certain number of hits, somewhere around 1 million (not always the same
> number) I get OutOfMemory exception that looks like this:
You did
Lucene ecosystem search :: http://search-lucene.com/
>From: Tamara Bobic
>To: java-user@lucene.apache.org
>Cc: Roman Klinger
>Sent: Tuesday, October 18, 2011 12:21 PM
>Subject: OutOfMemoryError
Hi all,
I am using Lucene to query Medline abstracts and as a result I get around 3
million hits. Each of the hits is processed and information from a certain
field is used.
After a certain number of hits, somewhere around 1 million (not always the
same number), I get an OutOfMemory exception that looks like this:
Complicated with all those indexes.
3 suggestions:
1. Just give it more memory.
2. Profile it to find out what is actually using the memory.
3. Cut down the number of indexes. See recent threads on pros and
cons of multiple indexes vs one larger index.
--
Ian.
On Mon, Jun 20, 2011 at 2:
Hi Erick,
In continuation of my mails below: I have a socket-based multithreaded
server that serves on average 1 request per second.
The index size is 31GB and the document count is about 22 million.
The index directories are first divided into 4 directories, and each of
those is then subdivided into 21 directories.
Hi Erick,
I will gather the info and let you know.
Thanks,
Harsh
On 6/17/11, Erick Erickson wrote:
> Please review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> You've given us no information to go on here, what are you
> trying to do when this happens? What have you tried? What
> is the quer
Please review:
http://wiki.apache.org/solr/UsingMailingLists
You've given us no information to go on here, what are you
trying to do when this happens? What have you tried? What
is the query you're running when this happens? How much
memory are you allocating to the JVM?
You're apparently sorting
Hi List,
Can anyone shed some light on why I sometimes get the error below and the
application hangs?
I am using Lucene 3.1.
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: Java heap space
Claudio wrote:
Hi,
I am using Lucene 2.9.4 with FSDirectory.
My index has 80 thousand documents (each document has 12 fields).
My JVM has 70MB of RAM (limited by my hosting).
I am getting various OutOfMemoryErrors.
I ran jmap and I got:
num   #instances   #bytes   Class description
- Original Message -
> From: Monique Monteiro
> To: java-user@lucene.apache.org
> Sent: Fri, March 5, 2010 1:38:31 PM
> Subject: OutOfMemoryError
>
> Hi all,
>
> I’m new to Lucene and I’m evaluating it in a web application which looks
> up strings in a huge index –
around
950MB. I did some optimization in order to share some fields in two
“composed” indices, but in a web application with less than 1GB for JVM,
OutOfMemoryError is generated. It seems that the searcher keeps some form of
cache which is not frequently released.
I’d like to know if this kind of
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Nuno Seco [mailto:ns...@dei.uc.pt]
Sent: Thursday, November 12, 2009 6:08 PM
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError when using Sort
Ok. Thanks.
The doc. says:
"Finds the top |n| hits for |que
> Subject: Re: OutOfMemoryError when using Sort
>
> Ok. Thanks.
>
> The doc. says:
> "Finds the top |n| hits for |query|, applying |filter| if non-null, and
> sorting the hits by the criteria in |sort|."
>
> I understood that only the hits (50 in this) for the c
>> You need to shard your index (break it up onto multiple machines, do your
>> sort distributed, and merge the results) if you want to do this sorting
>> with any kind of performance.
>>
>> -jake
>>
>> On Thu, Nov 12, 2009 at 7:57 AM, Nu
docs = searcher.search(parser.parse(search), null, 50, sort);
Every time I execute a query I get an OutOfMemoryError exception.
But if I execute the query without the Sort object it works fine.
Let me briefly explain how my index is structured.
I'm indexing the Google 5-grams
(http://googleresearch.blogspot.com/2006/08/all-our-n-
On Thu, Nov 12, 2009 at 7:57 AM, Nuno Seco wrote:
> Hello List.
>
> I'm having a problem when I add a Sort object to my searcher:
> docs = searcher.search(parser.parse(search), null, 50, sort);
>
> Every time
-Original Message-
> From: Nuno Seco [mailto:ns...@dei.uc.pt]
> Sent: Thursday, November 12, 2009 4:58 PM
> To: java-user@lucene.apache.org
> Subject: OutOfMemoryError when using Sort
>
> Hello List.
>
> I'm having a problem when I add a Sort object to my search
Hello List.
I'm having a problem when I add a Sort object to my searcher:
docs = searcher.search(parser.parse(search), null, 50, sort);
Every time I execute a query I get an OutOfMemoryError exception.
But if I execute the query without the Sort object it works fine.
Let me briefly explain ...
Subject: Re: OutOfMemoryError using IndexWriter
Interesting that excessive deletes buffering is not your problem...
Even if you can't post the resulting test case, if you can simplify it
& run locally, to rule out anything outside Lucene that's allocating
the byte/char/byte[] arrays, that ca
Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Thu 25.06.2009 13:13
> To: java-user@lucene.apache.org
> Subject: Re: OutOfMemoryError using IndexWriter
>
> Can you post your test code? If you can make it a standalone test,
> then I can repro and dig down faster.
it is
similar to creating a new IndexWriter.
HTH,
Stefan
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thu 25.06.2009 13:13
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError using IndexWriter
Can you post your test code? If yo
OK it looks like no merging was done.
I think the next step is to call
IndexWriter.setMaxBufferedDeleteTerms(1000) and see if that prevents
the OOM.
Mike
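For reference, a sketch of that call against the Lucene 2.4-era API; the directory path and analyzer here are placeholders, not taken from the thread:

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Sketch only: cap buffered delete terms so they are flushed before they pile up in RAM.
public class BoundedDeletes {
    public static IndexWriter openWriter() throws IOException {
        Directory dir = FSDirectory.getDirectory("/path/to/index"); // placeholder path
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), false,
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.setMaxBufferedDeleteTerms(1000);
        return writer;
    }
}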
On Thu, Jun 25, 2009 at 7:16 AM, stefan wrote:
> Hi,
>
> Here are the result of CheckIndex. I ran this just after I got the OOError.
>
> OK [4
Hi,
Here are the result of CheckIndex. I ran this just after I got the OOError.
OK [4 fields]
test: terms, freq, prox...OK [509534 terms; 9126904 terms/docs pairs;
4933036 tokens]
test: stored fields...OK [148124 total field count; avg 2 fields per
doc]
test: term vectors...
>> ... arrays, I will need some more time for this.
>>
>> Stefan
>>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Wed 24.06.2009 17:50
>> To: java-user@lucene.apache.org
>> Subject: Re: OutOfMemoryError using IndexWriter
Hi,
>But a "leak" would keep leaking over time, right? Ie even a 1 GB heap
>on your test db should eventually throw OOME if there's really a leak.
No, not necessarily, since I stop indexing once everything is indexed - I shall
try repeated runs with 120MB.
>Are you calling updateDocument (which
On Wed, Jun 24, 2009 at 10:23 AM, stefan wrote:
> does Lucene keep the complete index in memory ?
No.
Certain things (deleted docs, norms, field cache, terms index) are
loaded into memory, but these are tiny compared to what's not loaded
into memory (postings, stored docs, term vectors).
> As s
On Wed, Jun 24, 2009 at 10:18 AM, stefan wrote:
>
> Hi,
>
>
>>OK so this means it's not a leak, and instead it's just that stuff is
>>consuming more RAM than expected.
> Or that my test db is smaller than the production db which is indeed the case.
But a "leak" would keep leaking over time, right?
From: stefan [mailto:ste...@intermediate.de]
Sent: Wednesday, June 24, 2009 10:23 AM
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError using IndexWriter
Hi,
does Lucene keep the complete index in memory?
As stated before, the resulting index is 50MB; this would correlate with the memory
Some hint as to whether this is the case, from the programming side, would be
appreciated ...
Stefan
-Original Message-
From: Sudarsan, Sithu D. [mailto:sithu.sudar...@fda.hhs.gov]
Sent: Wed 24.06.2009 16:18
To: java-user@lucene.apache.org
Subject: RE: OutOfMemoryError using
open.
Please post your results/views.
Sincerely,
Sithu
-Original Message-
From: stefan [mailto:ste...@intermediate.de]
Sent: Wednesday, June 24, 2009 10:08 AM
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError using IndexWriter
Hi,
I do use Win32.
What do you mean by
Hi,
>OK so this means it's not a leak, and instead it's just that stuff is
>consuming more RAM than expected.
Or that my test db is smaller than the production db which is indeed the case.
>Hmm -- there are quite a few buffered deletes pending. It could be we
>are under-accounting for RAM used
Sent: Wed 24.06.2009 15:55
To: java-user@lucene.apache.org
Subject: RE: OutOfMemoryError using IndexWriter
Hi Stefan,
Are you using 32-bit Windows? If so, sometimes, if the index file before
optimization crosses your JVM memory usage settings (say 512MB),
there is a possibility of this
... IndexWriter for the complete indexing operation; I do not
call optimize but get an OOMError.
Stefan
-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wed 24.06.2009 14:22
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError using
sithu.sudar...@fda.hhs.gov
sdsudar...@ualr.edu
-Original Message-
From: stefan [mailto:ste...@intermediate.de]
Sent: Wednesday, June 24, 2009 4:09 AM
To: java-user@lucene.apache.org
Subject: OutOfMemoryError using IndexWriter
Hi,
I am using Lucene 2.4.1 to index a database with less
On Wed, Jun 24, 2009 at 7:43 AM, stefan wrote:
> I tried with 100MB heap size and got the Error as well, it runs fine with
> 120MB.
OK so this means it's not a leak, and instead it's just that stuff is
consuming more RAM than expected.
> Here is the histogram (application classes marked with --
Stefan
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wed 24.06.2009 11:52
To: java-user@lucene.apache.org
Subject: Re: OutOfMemoryError using IndexWriter
Hmm -- I think your test env (80 MB heap, 50 MB used by app + 16 MB
IndexWriter RAM buffer
) 3268608 (size)
>
> Well, something I should do differently?
>
> Stefan
>
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Wed 24.06.2009 10:48
> To: java-user@lucene.apache.org
> Subject: Re: OutOfMemory
How large is the RAM buffer that you're giving IndexWriter? How large
a heap size do you give to JVM?
Can you post one of the OOM exceptions you're hitting?
Mike
On Wed, Jun 24, 2009 at 4:08 AM, stefan wrote:
> Hi,
>
> I am using Lucene 2.4.1 to index a database with less than a million records
Hi,
I am using Lucene 2.4.1 to index a database with less than a million records.
The resulting index is about 50MB in size.
I keep getting an OutOfMemoryError if I re-use the same IndexWriter to index
the complete database, even though this is recommended in the performance
hints.
What I now do is
I am very interested indeed, do I understand correctly that the tweak
you made reduces the memory when searching if you have many docs in
the index? I am omitting norms too.
If that is the case, can someone point me to what is the required
change that should be done? I understand from Yonik's comm
On Mon, 2008-01-07 at 14:20 -0800, Otis Gospodnetic wrote:
> Please post your results, Lars!
Tried the patch, and it failed to compile (plain Lucene compiled fine).
In the process, I looked at TermQuery and found that it'd be easier to
copy that code and just hardcode 1.0f for all norms. Did tha
On Jan 7, 2008 5:00 AM, Lars Clausen <[EMAIL PROTECTED]> wrote:
> Doesn't appear to be the case in our test. We had two fields with
> norms, omitting saved only about 4MB for 50 million entries.
It should be 50MB. If you are measuring with an external tool, then
that tool is probably in error.
Please post your results, Lars!
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Lars Clausen <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, January 7, 2008 5:00:54 AM
Subject: Re: OutOfMemoryError on small sea
On Tue, 2008-01-01 at 23:38 -0800, Chris Hostetter wrote:
> : On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
>
> : Seems there's a reason we still use all this memory:
> : SegmentReader.fakeNorms() creates the full-size array for us anyway, so
> : the memory usage cannot be avoided as lon
: On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
: Seems there's a reason we still use all this memory:
: SegmentReader.fakeNorms() creates the full-size array for us anyway, so
: the memory usage cannot be avoided as long as somebody asks for the
: norms array at any point. The solution
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
> I've now made trial runs with no norms on the two indexed fields, and
> also tried with varying TermIndexIntervals. Omitting the norms saves
> about 4MB on 50 million entries, much less than I expected.
Seems there's a reason we still use
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
> Increasing
> the TermIndexInterval by a factor of 4 gave no measurable savings.
Following up on myself because I'm not 100% sure that the indexes have
the term index intervals I expect, and I'd like to check. Where can I
see what term ind
On Tue, 2007-11-13 at 07:26 -0800, Chris Hostetter wrote:
> : > Can it be right that memory usage depends on size of the index rather
> : > than size of the result?
> :
> : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
> : the JVM now?
>
> and in general: yes. Luc
: > Can it be right that memory usage depends on size of the index rather
: > than size of the result?
:
: Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
: the JVM now?
and in general: yes. Lucene is using memory so that *lots* of searches
can be fast ... if you r
On Tuesday, 13 November 2007, Lars Clausen wrote:
> Can it be right that memory usage depends on size of the index rather
> than size of the result?
Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
the JVM now?
Regards
Daniel
--
http://www.danielnaber.de
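A sketch of the knob Daniel points at, against the IndexWriter API of that era; the path, analyzer, and the factor of 4 are illustrative only (and, as later messages in this archive note, the savings may be smaller than expected):

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Sketch only (Lucene 2.x-era API): a larger term index interval keeps fewer
// terms of the term dictionary index in RAM, at some cost in term lookup speed.
// It only applies to newly written segments, so the index must be rebuilt to benefit.
public class SparserTermIndex {
    public static IndexWriter openWriter(String indexPath) throws IOException {
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), true);
        writer.setTermIndexInterval(4 * IndexWriter.DEFAULT_TERM_INDEX_INTERVAL); // default is 128
        return writer;
    }
}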
---
We've run into a blocking problem with our use of Lucene: we get
OutOfMemoryError when performing a one-term search in our index. The
search, if completed, should give only a few thousand hits, but from
inspecting a heap dump it appears that many more documents in the index
get stored in L
> From: Chris Hostetter [mailto:[EMAIL PROTECTED]
> : Setting writer.setMaxFieldLength(5000) (default is 10,000)
> : seems to eliminate the risk for an OutOfMemoryError,
>
> that's because it now gives up after parsing 5000 tokens.
>
> : To me, it appears that simpl
: Setting writer.setMaxFieldLength(5000) (default is 10,000)
: seems to eliminate the risk for an OutOfMemoryError,
that's because it now gives up after parsing 5000 tokens.
: To me, it appears that simply calling
:new Field("content", new InputStreamReader(in, "ISO-88
Aha, that's interesting. However...
Setting writer.setMaxFieldLength(5000) (default is 10,000)
seems to eliminate the risk for an OutOfMemoryError,
even with a JVM with only 64 MB max memory.
(I have tried larger values for JVM max memory, too.)
(The name is imho slightly misleading ...)
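A sketch of that workaround against the IndexWriter API of that era; the path and analyzer are placeholders. Note that it works by silently dropping everything past the configured number of tokens per field:

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Sketch only (Lucene 2.x-era API): cap how many tokens of a single field are
// indexed, which also caps the per-document memory spent on analysis and inversion.
public class CappedFieldLength {
    public static IndexWriter openWriter(String indexPath) throws IOException {
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), true);
        writer.setMaxFieldLength(5000);
        return writer;
    }
}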
On 8/31/07, Per Lindberg <[EMAIL PROTECTED]> wrote:
I'm creating a tokenized "content" Field from a plain text file
using an InputStreamReader and new Field("content", in);
The text file is large, 20 MB, and contains zillions of lines,
each with the same 100-character token.
That causes an OutOfMemoryError.
Given that all tokens are the *same*, why should this cause an OutOfMemoryError?
Thanks for your quick reply.
I will go through it.
Regards,
Jelda
> -Original Message-
> From: mark harwood [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 02, 2006 5:03 PM
> To: java-user@lucene.apache.org
> Subject: RE: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
"Category counts" should really be a FAQ entry.
There is no one right solution to prescribe because it
depends on the shape of your data.
For previous discussions/code samples see here:
http://www.mail-archive.com/java-user@lucene.apache.org/msg05123.html
and here for more space-efficient representations: ...
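One bitset-style approach along the lines those links discuss, sketched against the IndexReader API of that era; the "category" field name and the pre-collected BitSet of query matches are assumptions for illustration:

import java.io.IOException;
import java.util.BitSet;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

// Sketch only: for each term of the "category" field, count how many of the
// docs matching the current query also carry that category.
public class CategoryCounter {
    public static void printCounts(IndexReader reader, BitSet queryMatches) throws IOException {
        TermEnum terms = reader.terms(new Term("category", ""));
        TermDocs termDocs = reader.termDocs();
        try {
            do {
                Term t = terms.term();
                if (t == null || !"category".equals(t.field())) {
                    break; // ran past the last category term
                }
                termDocs.seek(t);
                int count = 0;
                while (termDocs.next()) {
                    if (queryMatches.get(termDocs.doc())) {
                        count++;
                    }
                }
                System.out.println(t.text() + ": " + count);
            } while (terms.next());
        } finally {
            termDocs.close();
            terms.close();
        }
    }
}

A common refinement is to cache one BitSet per category per reader rather than re-walking TermDocs for every request.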
mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 02, 2006 4:41 PM
> To: java-user@lucene.apache.org
> Subject: RE: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
>
> I am trying to implement category counts, similar to the
> CNET approach.
> At the initia