[lucene-6.3.0] hit tragic OutOfMemoryError inside getReader

2017-07-12 Thread sandesh.yapuram
Just after the indexing process is complete, when I try to run a simple query, the application hits OutOfMemoryError: Java heap space. The InfoReader log reports 'hit exception during NRT Reader' <http://lucene.472066.n3.nabble.com/file/n4345589/exception_during_nrt_reader.png> Also

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread ryanb
exceeded by so much though. Average document size is much smaller, definitely below 100K. Handling large documents is relatively atypical, but when we get them there are a relatively large number of them to be processed together.

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread ryanb
is a legal context where you need to be able to see, and eventually look at, all of the documents matching a query (even if they are 100+M). Thanks Erick!

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread Trejkaz
On Wed, Nov 26, 2014 at 2:09 PM, Erick Erickson wrote: > Well > 2> seriously consider the utility of indexing a 100+M file. Assuming > it's mostly text, lots and lots and lots of queries will match it, and > it'll score pretty low due to length normalization. And you probably > can't return it to

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread Jack Krupansky
the above strategy would be reasonable, or do you need to process large numbers of large documents. -- Jack Krupansky -Original Message- From: ryanb Sent: Tuesday, November 25, 2014 7:39 PM To: java-user@lucene.apache.org Subject: OutOfMemoryError indexing large documents Hello, We

Re: OutOfMemoryError indexing large documents

2014-11-25 Thread Erick Erickson
mes need to index > large documents (100+ MB), but this results in extremely high memory usage, > to the point of OutOfMemoryError even with 17GB of heap. We allow up to 20 > documents to be indexed simultaneously, but the text to be analyzed and > indexed is streamed, not loaded into mem

OutOfMemoryError indexing large documents

2014-11-25 Thread ryanb
Hello, We use vanilla Lucene 4.9.0 in a 64 bit Linux OS. We sometimes need to index large documents (100+ MB), but this results in extremely high memory usage, to the point of OutOfMemoryError even with 17GB of heap. We allow up to 20 documents to be indexed simultaneously, but the text to be
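
The streamed-text pattern ryanb describes maps to Reader-valued fields. A minimal sketch (not from the thread; assumes Lucene 4.9 and illustrative paths), where the TextField wraps a Reader so the document body is analyzed from disk instead of being held in memory:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class StreamedIndexing {
        public static void main(String[] args) throws IOException {
            IndexWriterConfig cfg = new IndexWriterConfig(
                    Version.LUCENE_4_9, new StandardAnalyzer(Version.LUCENE_4_9));
            try (IndexWriter writer = new IndexWriter(
                    FSDirectory.open(new File("/tmp/idx")), cfg)) {
                Document doc = new Document();
                // Reader-valued fields are tokenized and indexed but not stored,
                // so a 100+ MB body never has to fit in the heap at once.
                doc.add(new TextField("content",
                        new BufferedReader(new FileReader("/tmp/big.txt"))));
                writer.addDocument(doc);
            }
        }
    }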

Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
Norms are not stored sparsely by the default codec. So they take 1 byte per doc per indexed field regardless of whether that doc had that field. There is no setting to turn this off in IndexReader, though you could make
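
Mike's point here is that norms cost one byte per document per indexed field. One possible way to act on it (a sketch, assuming the Lucene 4.x FieldType API; the helper is hypothetical) is to omit norms on fields that do not need length normalization:

    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.FieldType;
    import org.apache.lucene.document.TextField;

    public class NoNormsField {
        // With thousands of indexed fields, norms cost roughly
        // numDocs * numIndexedFields bytes of heap; omitting them
        // trades away length normalization and index-time boosts.
        public static Field make(String name, String value) {
            FieldType type = new FieldType(TextField.TYPE_NOT_STORED);
            type.setOmitNorms(true);
            type.freeze();
            return new Field(name, value, type);
        }
    }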

Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread Michael McCandless
We have 8 million documents and our jvm heap is 5G. > Thanks & Best Regards! > -- Original -- > From: "Michael McCandless" > Date: Sat, Sep 13, 2014 06:29 PM > To: "Lucene Users"

Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
Hi, Mike. In our use case, we have thousands of index fields, and different kinds of documents have different fields. Do you mean that the norms field will consume a lot of memory? Why? If we decide to disabl

Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
documents and our jvm heap is 5G. Thanks & Best Regards! -- Original -- From: "Michael McCandless" Date: Sat, Sep 13, 2014 06:29 PM To: "Lucene Users" Subject: Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer The w

Re: OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread Michael McCandless
regardless of whether that doc had indexed that field), or increase HEAP to the JVM. Mike McCandless http://blog.mikemccandless.com On Sat, Sep 13, 2014 at 4:25 AM, 308181687 <308181...@qq.com> wrote: > Hi, all > we got an OutOfMemoryError throwed by SimpleMergedSegmentWarmer. We use

OutOfMemoryError throwed by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
Hi, all, we got an OutOfMemoryError thrown by SimpleMergedSegmentWarmer. We use Lucene 4.7 and access index files via NRTCachingDirectory/MMapDirectory. Could anybody give me a hand? Stack trace is as follows: org.apache.lucene.index.MergePolicy$MergeException

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-10-08 Thread Michael McCandless
When you open this index for searching, how much heap do you give it? In general, you should give IndexWriter the same heap size, since during merge it will need to open N readers at once, and if you have RAM resident doc values fields, those need enough heap space. Also, the default DocValuesForm

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-10-07 Thread Michael van Rooyen
With forceMerge(1) throwing an OOM error, we switched to forceMergeDeletes() which worked for a while, but that is now also running out of memory. As a result, I've turned all manner of forced merges off. I'm more than a little apprehensive that if the OOM error can happen as part of a force

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-09-26 Thread Michael van Rooyen
Sent: Thursday, September 26, 2013 12:26 PM To: java-user@lucene.apache.org Cc: Ian Lea Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError Yes, it happens as part of the early morning optimize, and yes, it's a forceMerge(1) which I've disabled for now. I haven't looked at

RE: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-09-26 Thread Uwe Schindler
From: Michael van Rooyen [mailto:mich...@loot.co.za] > Sent: Thursday, September 26, 2013 12:26 PM > To: java-user@lucene.apache.org > Cc: Ian Lea > Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError > > Yes, it happens as part of the early morning optimize, and yes, it

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-09-26 Thread Michael van Rooyen
Yes, it happens as part of the early morning optimize, and yes, it's a forceMerge(1) which I've disabled for now. I haven't looked at the persistence mechanism for Lucene since 2.x, but if I remember correctly, the deleted documents would stay in an index segment until that segment was eventua

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-09-26 Thread Ian Lea
Is this OOM happening as part of your early morning optimize or at some other point? By optimize do you mean IndexWriter.forceMerge(1)? You really shouldn't have to use that. If the index grows forever without it then something else is going on which you might wish to report separately. -- Ian.
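
Ian's point is that a routine forceMerge(1) is rarely needed; the merge policy reclaims deleted documents on its own. A sketch of leaning on it instead (assumes Lucene 4.4's TieredMergePolicy; the weight value is a guess, not from the thread):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.TieredMergePolicy;
    import org.apache.lucene.util.Version;

    public class MergeTuning {
        public static IndexWriterConfig config() {
            IndexWriterConfig cfg = new IndexWriterConfig(
                    Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
            TieredMergePolicy mp = new TieredMergePolicy();
            // Default is 2.0; a higher weight makes background merges
            // prefer segments with many deletes, so space is reclaimed
            // gradually instead of in one giant forceMerge(1).
            mp.setReclaimDeletesWeight(3.0);
            cfg.setMergePolicy(mp);
            return cfg;
        }
    }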

Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-09-25 Thread Michael van Rooyen
We've recently upgraded to Lucene 4.4.0 and mergeSegments now causes an OOM error. As background, our index contains about 14 million documents (growing slowly) and we process about 1 million updates per day. It's about 8GB on disk. I'm not sure if the Lucene segments merge the way they used

OutOfMemoryError while indexing

2013-03-17 Thread Igor Shalyminov
Hi! I'm trying to make an index of several text documents. Their content is just fields of tab-separated strings: word<\t>w1<\t>w2<\t>...<\t>wn pos<\t>pos1<\t>pos2_a:pos2_b:pos2_c<\t>...<\t>posn_a:posn_b ... There are 5 documents with a total of 10 MB in size. While indexing, java uses about 2 GB o

Re: how to avoid OutOfMemoryError while indexing ?

2013-01-27 Thread Michael McCandless
You should set your RAMBufferSizeMB to something smaller than the full heap size of your JVM. Mike McCandless http://blog.mikemccandless.com On Sat, Jan 26, 2013 at 11:39 PM, wgggfiy wrote: > I found it is very easy to run into OutOfMemoryError. > My idea is that Lucene could set t

how to avoid OutOfMemoryError while indexing ?

2013-01-26 Thread wgggfiy
I found it is very easy to run into OutOfMemoryError. My idea is that Lucene could set the RAM buffer automatically, but I couldn't find the API. My code: IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer); int mb = 1024 * 1024; double ram = Runtime.getRu
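
The quoted code is cut off, but the idea, sizing the RAM buffer from the JVM's max heap rather than hard-coding it, can be sketched like this (a hypothetical completion in the spirit of Mike's advice that the buffer stay well below the heap; the fraction and cap are illustrative):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.util.Version;

    public class AutoRamBuffer {
        public static IndexWriterConfig config(Analyzer analyzer) {
            int mb = 1024 * 1024;
            double maxHeapMb = Runtime.getRuntime().maxMemory() / (double) mb;
            IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
            // Keep the indexing buffer well below the heap so analysis,
            // merging and the application itself have headroom.
            iwc.setRAMBufferSizeMB(Math.min(maxHeapMb / 4, 512));
            return iwc;
        }
    }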

Re: OutOfMemoryError when opening the index ?

2012-06-13 Thread Yang
ok, found it: we are using Cloudera CDHu3u, they change the ulimit for child jobs. but I still don't know how to change their default settings yet On Wed, Jun 13, 2012 at 2:15 PM, Yang wrote: > I got the OutOfMemoryError when I tried to open an Lucene index. > > it's very

Re: OutOfMemoryError

2011-10-19 Thread Tamara Bobic
, Tamara - Original Message - > From: "Otis Gospodnetic" > To: java-user@lucene.apache.org > Sent: Tuesday, October 18, 2011 11:14:12 PM > Subject: Re: OutOfMemoryError > > Bok Tamara, > > You didn't say what -Xmx value you are using. Try a little higher

RE: OutOfMemoryError

2011-10-18 Thread Uwe Schindler
Hi, > ...I get around 3 million hits. Each of the hits is processed and information from a certain > field is used. That's of course fine, but: > After a certain number of hits, somewhere around 1 million (not always the same > number), I get an OutOfMemory exception that looks like this: You did

Re: OutOfMemoryError

2011-10-18 Thread Mead Lai
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > From: Tamara Bobic > To: java-user@lucene.apache.org > Cc: Roman Klinger > Sent: Tues

Re: OutOfMemoryError

2011-10-18 Thread Otis Gospodnetic
Lucene ecosystem search :: http://search-lucene.com/ > From: Tamara Bobic > To: java-user@lucene.apache.org > Cc: Roman Klinger > Sent: Tuesday, October 18, 2011 12:21 PM > Subject: OutOfMemoryError > > Hi all, > > I am using Lucene to quer

OutOfMemoryError

2011-10-18 Thread Tamara Bobic
Hi all, I am using Lucene to query Medline abstracts and as a result I get around 3 million hits. Each of the hits is processed and information from a certain field is used. After a certain number of hits, somewhere around 1 million (not always the same number), I get an OutOfMemory exception that l
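
The replies in this thread (truncated above) focus on -Xmx, but the usual pattern for walking millions of hits without holding them all is a custom Collector that processes each doc id as it is collected. A sketch against the Lucene 3.x Collector API (not advice from the thread itself):

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.Collector;
    import org.apache.lucene.search.Scorer;

    public class StreamingCollector extends Collector {
        private int docBase;

        @Override public void setScorer(Scorer scorer) {}  // scores not needed
        @Override public void setNextReader(IndexReader reader, int docBase) {
            this.docBase = docBase;  // offset of this segment's doc ids
        }
        @Override public void collect(int doc) {
            int globalDoc = docBase + doc;
            // Read the one needed field for globalDoc here, then let it go;
            // nothing is accumulated, so memory stays flat across 3M hits.
        }
        @Override public boolean acceptsDocsOutOfOrder() { return true; }
    }

Used as searcher.search(query, new StreamingCollector()).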

Re: getting OutOfMemoryError

2011-06-21 Thread Ian Lea
Complicated with all those indexes. 3 suggestions: 1. Just give it more memory. 2. Profile it to find out what is actually using the memory. 3. Cut down the number of indexes. See recent threads on pros and cons of multiple indexes vs one larger index. -- Ian. On Mon, Jun 20, 2011 at 2:
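
Ian's third suggestion, fewer indexes, can also be approximated at search time by opening the existing directories under one MultiReader, so a single searcher serves them all. A sketch on the Lucene 3.1 API (the directory layout is hypothetical):

    import java.io.File;
    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.MultiReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    public class CombinedSearch {
        public static IndexSearcher open(File[] indexDirs) throws IOException {
            IndexReader[] readers = new IndexReader[indexDirs.length];
            for (int i = 0; i < indexDirs.length; i++) {
                // Read-only readers; the MultiReader closes them when it is closed.
                readers[i] = IndexReader.open(FSDirectory.open(indexDirs[i]), true);
            }
            return new IndexSearcher(new MultiReader(readers));
        }
    }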

Re: getting OutOfMemoryError

2011-06-20 Thread harsh srivastava
Hi Erick, In continuation to my mails below, I have a socket-based multithreaded server that serves on average 1 request per second. The index size is 31GB and the document count is about 22 million. The index directories are first divided into 4 directories and then each is subdivided into 21 directories.

Re: getting OutOfMemoryError

2011-06-17 Thread harsh srivastava
Hi Erick, I will gather the info and let you know. Thanks, harsh On 6/17/11, Erick Erickson wrote: > Please review: > http://wiki.apache.org/solr/UsingMailingLists > > You've given us no information to go on here, what are you > trying to do when this happens? What have you tried? What > is the quer

Re: getting OutOfMemoryError

2011-06-17 Thread Erick Erickson
Please review: http://wiki.apache.org/solr/UsingMailingLists You've given us no information to go on here, what are you trying to do when this happens? What have you tried? What is the query you're running when this happens? How much memory are you allocating to the JVM? You're apparently sorting

getting OutOfMemoryError

2011-06-17 Thread harsh srivastava
Hi List, Can anyone shed some light on why I sometimes get the below error and the application hangs? I am using Lucene 3.1. java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java h

Re: OutOfMemoryError with FSDirectory

2011-04-05 Thread Michael McCandless
as 12 fields). > My jvm has 70Mb of RAM memory (limited by my hosting). > I am getting various OutOfMemoryError. > I ran jmap and I got: > > num #instances #bytes Class description > -- > 1:

OutOfMemoryError with FSDirectory

2011-04-05 Thread Claudio R
Hi, I am using Lucene 2.9.4 with FSDirectory. My index has 80 thousand documents (each document has 12 fields). My jvm has 70Mb of RAM memory (limited by my hosting). I am getting various OutOfMemoryError. I ran jmap and I got: num #instances #bytes Class description

Re: OutOfMemoryError with FSDirectory

2011-04-04 Thread Claudio
Claudio wrote: Hi, I am using Lucene 2.9.4 with FSDirectory. My index has 80 thousand documents (each document has 12 fields). My jvm has 70Mb of RAM memory (limited by my hosting). I am getting various OutOfMemoryError. I ran jmap and I got: num #instances #bytes Cl

Re: OutOfMemoryError with FSDirectory

2011-04-04 Thread Erick Erickson
using Lucene 2.9.4 with FSDirectory. > My index has 80 thousand documents (each document has 12 fields). > My jvm has 70Mb of RAM memory (limited by my hosting). > I am getting various OutOfMemoryError. > I ran jmap and I got: > > num

OutOfMemoryError with FSDirectory

2011-04-04 Thread Claudio
Hi, I am using Lucene 2.9.4 with FSDirectory. My index has 80 thousand documents (each document has 12 fields). My jvm has 70Mb of RAM memory (limited by my hosting). I am getting various OutOfMemoryError. I ran jmap and I got: num #instances #bytes Class description

Re: OutOfMemoryError

2010-03-06 Thread Monique Monteiro
: Monique Monteiro > To: java-user@lucene.apache.org > Sent: Fri, March 5, 2010 1:38:31 PM > Subject: OutOfMemoryError > > Hi all, > > I’m new to Lucene and I’m evaluating it in a web application which looks

Re: OutOfMemoryError

2010-03-05 Thread Otis Gospodnetic
Message > From: Monique Monteiro > To: java-user@lucene.apache.org > Sent: Fri, March 5, 2010 1:38:31 PM > Subject: OutOfMemoryError > > Hi all, > > I’m new to Lucene and I’m evaluating it in a web application which looks > up strings in a huge index –

OutOfMemoryError

2010-03-05 Thread Monique Monteiro
around 950MB. I did some optimization in order to share some fields in two “composed” indices, but in a web application with less than 1GB for JVM, OutOfMemoryError is generated. It seems that the searcher keeps some form of cache which is not frequently released. I’d like to know if this kind of

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Nuno Seco
www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Nuno Seco [mailto:ns...@dei.uc.pt] Sent: Thursday, November 12, 2009 6:08 PM To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError when using Sort Ok. Thanks. The doc. says: "Finds the top |n| hits for |que

RE: OutOfMemoryError when using Sort

2009-11-12 Thread Uwe Schindler
> Subject: Re: OutOfMemoryError when using Sort > > Ok. Thanks. > > The doc. says: > "Finds the top |n| hits for |query|, applying |filter| if non-null, and > sorting the hits by the criteria in |sort|." > > I understood that only the hits (50 in this) for the c

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Jake Mannix
>> You need to shard your index (break it up onto multiple machines, do your >> sort >> distributed, and merge the results) if you want to do this sorting with >> any >> kind >> of performance. >> >> -jake >> >> On Thu, Nov 12, 2009 at 7:57 AM, Nu

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Nuno Seco
(search), null, 50, sort); Every time I execute a query I get an OutOfMemoryError exception. But if I execute the query without the Sort object it works fine. Let me briefly explain how my index is structured. I'm indexing the Google 5Grams ( http://googleresearch.blogspot.com/2006/08/all-our-n-

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Jake Mannix
t to do this sorting with any kind of performance. -jake On Thu, Nov 12, 2009 at 7:57 AM, Nuno Seco wrote: > Hello List. > > I'm having a problem when I add a Sort object to my searcher: > docs = searcher.search(parser.parse(search), null, 50, sort); > > Every time

RE: OutOfMemoryError when using Sort

2009-11-12 Thread Uwe Schindler
nal Message- > From: Nuno Seco [mailto:ns...@dei.uc.pt] > Sent: Thursday, November 12, 2009 4:58 PM > To: java-user@lucene.apache.org > Subject: OutOfMemoryError when using Sort > > Hello List. > > I'm having a problem when I add a Sort object to my search

OutOfMemoryError when using Sort

2009-11-12 Thread Nuno Seco
Hello List. I'm having a problem when I add a Sort object to my searcher: docs = searcher.search(parser.parse(search), null, 50, sort); Every time I execute a query I get an OutOfMemoryError exception. But if I execute the query without the Sort object it works fine. Let me briefly ex
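
What makes this call pattern memory-hungry is that sorting by a field populates the FieldCache with one entry per document in the index, not per hit, so even a 50-hit query pays for all documents. A sketch of the call from the thread (Lucene 2.9-era API; the field name is illustrative):

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;

    public class SortedSearch {
        public static TopDocs top50(IndexSearcher searcher, Query query)
                throws IOException {
            // String sorting builds a per-document FieldCache array sized
            // to the whole index -- the likely source of the OOM here.
            Sort sort = new Sort(new SortField("gram", SortField.STRING));
            return searcher.search(query, null, 50, sort);
        }
    }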

AW: OutOfMemoryError using IndexWriter

2009-06-25 Thread stefan
Subject: Re: OutOfMemoryError using IndexWriter Interesting that excessive deletes buffering is not your problem... Even if you can't post the resulting test case, if you can simplify it & run locally, to rule out anything outside Lucene that's allocating the byte/char/byte[] arrays, that ca

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Thu 25.06.2009 13:13 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError using IndexWriter > > Can you post your test code? If you can make it a standalone test, > then I can repro and dig down faster.

AW: OutOfMemoryError using IndexWriter

2009-06-25 Thread stefan
it is similar to creating a new IndexWriter. HTH, Stefan -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thu 25.06.2009 13:13 To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError using IndexWriter Can you post your test code? If yo

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
OK it looks like no merging was done. I think the next step is to call IndexWriter.setMaxBufferedDeleteTerms(1000) and see if that prevents the OOM. Mike On Thu, Jun 25, 2009 at 7:16 AM, stefan wrote: > Hi, > > Here are the result of CheckIndex. I ran this just after I got the OOError. > > OK [4
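
A minimal sketch of the suggested call on the Lucene 2.4.x API (the threshold 1000 is the value Mike proposes above):

    import org.apache.lucene.index.IndexWriter;

    public class BoundedDeletes {
        public static void configure(IndexWriter writer) {
            // Flush buffered delete terms once 1000 accumulate, so they
            // cannot grow the heap without bound between commits.
            writer.setMaxBufferedDeleteTerms(1000);
        }
    }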

AW: OutOfMemoryError using IndexWriter

2009-06-25 Thread stefan
Hi, Here are the result of CheckIndex. I ran this just after I got the OOError. OK [4 fields] test: terms, freq, prox...OK [509534 terms; 9126904 terms/docs pairs; 4933036 tokens] test: stored fields...OK [148124 total field count; avg 2 fields per doc] test: term vectors...

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Simon Willnauer
rays, I will need some >> more time for this. >> >> Stefan >> >> -Original Message- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Wed 24.06.2009 17:50 >> To: java-user@lucene.apache.org >> Subject: Re: OutO

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
time for this. > > Stefan > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wed 24.06.2009 17:50 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError using IndexWriter > > On Wed, Jun 24, 2009 at 10:1

AW: OutOfMemoryError using IndexWriter

2009-06-25 Thread stefan
...@mikemccandless.com] Sent: Wed 24.06.2009 17:50 To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError using IndexWriter On Wed, Jun 24, 2009 at 10:18 AM, stefan wrote: > > Hi, > > >>OK so this means it's not a leak, and instead it's just that stuff

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
On Thu, Jun 25, 2009 at 3:02 AM, stefan wrote: >>But a "leak" would keep leaking over time, right? Ie even a 1 GB heap >>on your test db should eventually throw OOME if there's really a leak. > No, not necessarily, since I stop indexing once everything is indexed - I > shall try repeated runs wit

AW: OutOfMemoryError using IndexWriter

2009-06-25 Thread stefan
Hi, >But a "leak" would keep leaking over time, right? Ie even a 1 GB heap >on your test db should eventually throw OOME if there's really a leak. No, not necessarily, since I stop indexing once everything is indexed - I shall try repeated runs with 120MB. >Are you calling updateDocument (which

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 10:23 AM, stefan wrote: > does Lucene keep the complete index in memory ? No. Certain things (deleted docs, norms, field cache, terms index) are loaded into memory, but these are tiny compared to what's not loaded into memory (postings, stored docs, term vectors). > As s

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 10:18 AM, stefan wrote: > > Hi, > > >>OK so this means it's not a leak, and instead it's just that stuff is >>consuming more RAM than expected. > Or that my test db is smaller than the production db which is indeed the case. But a "leak" would keep leaking over time, right?

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
: stefan [mailto:ste...@intermediate.de] Sent: Wednesday, June 24, 2009 10:23 AM To: java-user@lucene.apache.org Subject: AW: OutOfMemoryError using IndexWriter Hi, does Lucene keep the complete index in memory? As stated before, the result index is 50MB; this would correlate with the memory

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
some hint, whether this is the case, from the programming side would be appreciated ... Stefan -Original Message- From: Sudarsan, Sithu D. [mailto:sithu.sudar...@fda.hhs.gov] Sent: Wed 24.06.2009 16:18 To: java-user@lucene.apache.org Subject: RE: OutOfMemoryError using

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
open. Please post your results/views. Sincerely, Sithu -Original Message- From: stefan [mailto:ste...@intermediate.de] Sent: Wednesday, June 24, 2009 10:08 AM To: java-user@lucene.apache.org Subject: AW: OutOfMemoryError using IndexWriter Hi, I do use Win32. What do you mean by

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
Hi, >OK so this means it's not a leak, and instead it's just that stuff is >consuming more RAM than expected. Or that my test db is smaller than the production db which is indeed the case. >Hmm -- there are quite a few buffered deletes pending. It could be we >are under-accounting for RAM used

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
Sent: Wed 24.06.2009 15:55 To: java-user@lucene.apache.org Subject: RE: OutOfMemoryError using IndexWriter Hi Stefan, Are you using Windows 32 bit? If so, sometimes, if the index file before optimization crosses your jvm memory usage settings (say 512MB), there is a possibility of this

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
IndexWriter for the complete indexing operation, I do not call optimize but get an OOMError. Stefan -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Wed 24.06.2009 14:22 To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError using

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
-2587 sithu.sudar...@fda.hhs.gov sdsudar...@ualr.edu -Original Message- From: stefan [mailto:ste...@intermediate.de] Sent: Wednesday, June 24, 2009 4:09 AM To: java-user@lucene.apache.org Subject: OutOfMemoryError using IndexWriter Hi, I am using Lucene 2.4.1 to index a database with less

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 7:43 AM, stefan wrote: > I tried with 100MB heap size and got the Error as well, it runs fine with > 120MB. OK so this means it's not a leak, and instead it's just that stuff is consuming more RAM than expected. > Here is the histogram (application classes marked with --

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Otis Gospodnetic
24, 2009 4:08:43 AM > Subject: OutOfMemoryError using IndexWriter > > Hi, > > I am using Lucene 2.4.1 to index a database with less than a million records. > The resulting index is about 50MB in size. > I keep getting an OutOfMemory Error if I re-use the same IndexWriter to

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
Stefan -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wed 24.06.2009 11:52 To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError using IndexWriter Hmm -- I think your test env (80 MB heap, 50 MB used by app + 16 MB IndexWriter RAM buffe

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
) 3268608 (size) > > Well, something I should do differently? > > Stefan > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wed 24.06.2009 10:48 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemory

AW: OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
: OutOfMemoryError using IndexWriter How large is the RAM buffer that you're giving IndexWriter? How large a heap size do you give to JVM? Can you post one of the OOM exceptions you're hitting? Mike On Wed, Jun 24, 2009 at 4:08 AM, stefan wrote: > Hi, > > I am using Lucene 2.4.1 to in

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
How large is the RAM buffer that you're giving IndexWriter? How large a heap size do you give to JVM? Can you post one of the OOM exceptions you're hitting? Mike On Wed, Jun 24, 2009 at 4:08 AM, stefan wrote: > Hi, > > I am using Lucene 2.4.1 to index a database with less than a million records

OutOfMemoryError using IndexWriter

2009-06-24 Thread stefan
Hi, I am using Lucene 2.4.1 to index a database with less than a million records. The resulting index is about 50MB in size. I keep getting an OutOfMemory Error if I re-use the same IndexWriter to index the complete database, even though this is recommended in the performance hints. What I now do is

Re: OutOfMemoryError on small search in large, simple index

2008-01-25 Thread jm
I am very interested indeed. Do I understand correctly that the tweak you made reduces memory use when searching if you have many docs in the index? I am omitting norms too. If that is the case, can someone point me to the required change that should be made? I understand from Yonik's comm

Re: OutOfMemoryError on small search in large, simple index

2008-01-08 Thread Lars Clausen
On Mon, 2008-01-07 at 14:20 -0800, Otis Gospodnetic wrote: > Please post your results, Lars! Tried the patch, and it failed to compile (plain Lucene compiled fine). In the process, I looked at TermQuery and found that it'd be easier to copy that code and just hardcode 1.0f for all norms. Did tha

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Yonik Seeley
On Jan 7, 2008 5:00 AM, Lars Clausen <[EMAIL PROTECTED]> wrote: > Doesn't appear to be the case in our test. We had two fields with > norms, omitting saved only about 4MB for 50 million entries. It should be 50MB. If you are measuring with an external tool, then that tool is probably in error.

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Otis Gospodnetic
Please post your results, Lars! Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Lars Clausen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, January 7, 2008 5:00:54 AM Subject: Re: OutOfMemoryError on small sea

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Lars Clausen
On Tue, 2008-01-01 at 23:38 -0800, Chris Hostetter wrote: > : On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > > : Seems there's a reason we still use all this memory: > : SegmentReader.fakeNorms() creates the full-size array for us anyway, so > : the memory usage cannot be avoided as lon

Re: OutOfMemoryError on small search in large, simple index

2008-01-01 Thread Chris Hostetter
: On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: : Seems there's a reason we still use all this memory: : SegmentReader.fakeNorms() creates the full-size array for us anyway, so : the memory usage cannot be avoided as long as somebody asks for the : norms array at any point. The solution

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > I've now made trial runs with no norms on the two indexed fields, and > also tried with varying TermIndexIntervals. Omitting the norms saves > about 4MB on 50 million entries, much less than I expected. Seems there's a reason we still use

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > Increasing > the TermIndexInterval by a factor of 4 gave no measurable savings. Following up on myself because I'm not 100% sure that the indexes have the term index intervals I expect, and I'd like to check. Where can I see what term ind

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Tue, 2007-11-13 at 07:26 -0800, Chris Hostetter wrote: > : > Can it be right that memory usage depends on size of the index rather > : > than size of the result? > : > : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to > : the JVM now? > > and in general: yes. Luc

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Chris Hostetter
: > Can it be right that memory usage depends on size of the index rather : > than size of the result? : : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to : the JVM now? and in general: yes. Lucene is using memory so that *lots* of searches can be fast ... if you r

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Daniel Naber
On Tuesday, 13. November 2007, Lars Clausen wrote: > Can it be right that memory usage depends on size of the index rather > than size of the result? Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to the JVM now? Regards Daniel -- http://www.danielnaber.de ---
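
A sketch of the setting Daniel points to (Lucene 2.x API; the value 512 is illustrative). Readers hold roughly numTerms / interval index terms in RAM, so a larger interval shrinks the reader's heap footprint at some term-lookup cost:

    import org.apache.lucene.index.IndexWriter;

    public class TermIndexTuning {
        public static void configure(IndexWriter writer) {
            // Default is 128; 512 keeps ~4x fewer index terms in memory.
            writer.setTermIndexInterval(512);
        }
    }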

OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Lars Clausen
We've run into a blocking problem with our use of Lucene: we get OutOfMemoryError when performing a one-term search in our index. The search, if completed, should give only a few thousand hits, but from inspecting a heap dump it appears that many more documents in the index get stored in L

OutOfMemoryError: allocLargeArray

2007-09-13 Thread testn

SV: SV: OutOfMemoryError tokenizing a boring text file

2007-09-11 Thread Per Lindberg
> From: Chris Hostetter [mailto:[EMAIL PROTECTED] > : Setting writer.setMaxFieldLength(5000) (default is 10000) : seems to eliminate the risk for an OutOfMemoryError, > > that's because it now gives up after parsing 5000 tokens. > > : To me, it appears that simpl

Re: SV: OutOfMemoryError tokenizing a boring text file

2007-09-03 Thread Chris Hostetter
: Setting writer.setMaxFieldLength(5000) (default is 10000) : seems to eliminate the risk for an OutOfMemoryError, that's because it now gives up after parsing 5000 tokens. : To me, it appears that simply calling : new Field("content", new InputStreamReader(in, "ISO-88

SV: OutOfMemoryError tokenizing a boring text file

2007-09-03 Thread Per Lindberg
Aha, that's interesting. However... Setting writer.setMaxFieldLength(5000) (default is 10000) seems to eliminate the risk for an OutOfMemoryError, even with a JVM with only 64 MB max memory. (I have tried larger values for JVM max memory, too). (The name is imho slightly misleading, I
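
The setting under discussion, sketched on the Lucene 2.x API (5000 is the value from the thread; the default is 10000 tokens per field):

    import org.apache.lucene.index.IndexWriter;

    public class FieldLengthCap {
        public static void configure(IndexWriter writer) {
            // Stop indexing a field after 5000 tokens, so a pathological
            // document cannot buffer an unbounded number of terms.
            writer.setMaxFieldLength(5000);
        }
    }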

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Karl Wettin
tent", in); The text file is large, 20 MB, and contains zillions lines, each with the the same 100-character token. That causes an OutOfMemoryError. Given that all tokens are the *same*, why should this cause an OutOfMemoryError? Shouldn't StandardAnalyzer just chug along and just note

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Askar Zaidi
k > > On 8/31/07, Per Lindberg <[EMAIL PROTECTED]> wrote: > > > > I'm creating a tokenized "content" Field from a plain text file > > using an InputStreamReader and new Field("content", in); > > > > The text file is large, 20 MB, a

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Erick Erickson
ECTED]> wrote: > > I'm creating a tokenized "content" Field from a plain text file > using an InputStreamReader and new Field("content", in); > > The text file is large, 20 MB, and contains zillions of lines, > each with the same 100-character tok

OutOfMemoryError tokenizing a boring text file

2007-08-31 Thread Per Lindberg
I'm creating a tokenized "content" Field from a plain text file using an InputStreamReader and new Field("content", in); The text file is large, 20 MB, and contains zillions of lines, each with the same 100-character token. That causes an OutOfMemoryError. Given that

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
Thanks for your quick reply, I will go through it. Regards, Jelda > -Original Message- > From: mark harwood [mailto:[EMAIL PROTECTED] > Sent: Tuesday, May 02, 2006 5:03 PM > To: java-user@lucene.apache.org > Subject: RE: OutOfMemoryError while enumerating through > read

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread mark harwood
"Category counts" should really be a FAQ entry. There is no one right solution to prescribe because it depends on the shape of your data. For previous discussions/code samples see here: http://www.mail-archive.com/java-user@lucene.apache.org/msg05123.html and here for more space-efficient repre

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
mailto:[EMAIL PROTECTED] > Sent: Tuesday, May 02, 2006 4:41 PM > To: java-user@lucene.apache.org > Subject: RE: OutOfMemoryError while enumerating through > reader.terms(fieldName) > > I am trying to implement category counts, similar to the > CNET approach. At the initia
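
Enumerating reader.terms(fieldName) for category counts looks roughly like this on the Lucene 1.9/2.0 API (a sketch, not Jelda's code; docFreq() gives a per-category count without loading any documents, though it includes deleted ones):

    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermEnum;

    public class CategoryCounts {
        public static void count(IndexReader reader, String field) throws IOException {
            TermEnum terms = reader.terms(new Term(field, ""));  // seek to field's first term
            try {
                do {
                    Term t = terms.term();
                    if (t == null || !t.field().equals(field)) break;  // ran past the field
                    System.out.println(t.text() + " -> " + terms.docFreq());
                } while (terms.next());
            } finally {
                terms.close();
            }
        }
    }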
