Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-14 Thread Michael McCandless
ose to not be called -- really you should move the .close() > calls into a finally clause). > > Mike > > On Mon, Apr 12, 2010 at 10:54 AM, Herbert Roitblat wrote: >> >> Update: >> reusing the reader and searcher made almost no difference. It still eats >> up >>
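A minimal sketch of the "move the .close() calls into a finally clause" advice, in Python since the code under discussion uses PyLucene. StandInReader and read_terms are hypothetical names invented for illustration; they are not PyLucene classes, and the real reader's methods will differ.

```python
class StandInReader:
    """Hypothetical stand-in for an index reader; not a PyLucene class."""

    def __init__(self):
        self.closed = False

    def term_count(self, fail=False):
        # Simulate work that may raise part-way through.
        if fail:
            raise RuntimeError("simulated failure while reading")
        return 3

    def close(self):
        self.closed = True


def read_terms(reader, fail=False):
    # close() sits in the finally clause, so it runs whether the body
    # returns normally or raises.
    try:
        return reader.term_count(fail=fail)
    finally:
        reader.close()
```

With this shape, an exception raised while reading still propagates to the caller, but the reader is closed first.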

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-14 Thread Herbert Roitblat
al Message - From: "Michael McCandless" To: Sent: Tuesday, April 13, 2010 2:46 AM Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded Can you whittle down your example even more? EG don't read the term vectors for the first hit. Just open a single reader and do

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-13 Thread Michael McCandless
ts up > the heap. > - Original Message - From: "Herbert L Roitblat" > To: > Sent: Monday, April 12, 2010 6:50 AM > Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded > > >> Thank you Michael.  Your suggestions are helpful.  I inherited all of >>

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-13 Thread Michael McCandless
On Mon, Apr 12, 2010 at 9:50 AM, Herbert L Roitblat wrote: > Thank you Michael. Your suggestions are helpful. I inherited all of the > code that uses pyLucene and don't consider myself an expert on it, so I very > much appreciate your suggestions. > > It does not seem to be the case that these e

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-12 Thread Herbert Roitblat
Update: reusing the reader and searcher made almost no difference. It still eats up the heap. - Original Message - From: "Herbert L Roitblat" To: Sent: Monday, April 12, 2010 6:50 AM Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded Thank you Mich

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-12 Thread Herbert L Roitblat
Thank you Michael. Your suggestions are helpful. I inherited all of the code that uses pyLucene and don't consider myself an expert on it, so I very much appreciate your suggestions. It does not seem to be the case that these elements represent the index of the collection. TermInfo and Term

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-12 Thread Michael McCandless
The large count of TermInfo & Term is completely normal -- this is Lucene's term index, which is entirely RAM resident. In 3.1, with flexible indexing, the RAM efficiency of the terms index should be much improved. While opening a new reader/searcher for every query is horribly inefficient, it sh
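A rough sketch of the alternative to opening a new reader/searcher for every query: open one, reuse it for all queries, and close it once at the end. StandInSearcher and run_queries are hypothetical names used only for illustration; they are not part of PyLucene, and the class-level counter exists only to show how many instances were opened.

```python
class StandInSearcher:
    """Hypothetical stand-in for a searcher; not a PyLucene class."""

    opened = 0  # counts how many instances have been created

    def __init__(self):
        StandInSearcher.opened += 1
        self.closed = False

    def search(self, term):
        return f"hits for {term}"

    def close(self):
        self.closed = True


def run_queries(terms):
    # One searcher is opened outside the loop and shared by every query.
    searcher = StandInSearcher()
    try:
        return [searcher.search(t) for t in terms]
    finally:
        # Closed exactly once, even if a query raises.
        searcher.close()
```

Calling run_queries with three terms creates a single StandInSearcher, where a per-query version would create three.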

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-11 Thread Herbert L Roitblat
Hi, Folks. Thanks, Ruben, for your help. It let me get a ways down the road. The problem is that the heap is filling up when I am doing a lucene.TermQuery. What I am trying to accomplish is to get the terms in one field of each document and their frequency in the document. A code snippet i

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-09 Thread Ruben Laguna
Take a memory snapshot with JConsole -> dumpHeap [1] and then analyze it with Eclipse MAT [2]. Find the biggest objects and look at their path to GC roots to see if Lucene is actually retaining them. You may also want to look at two recently closed bug reports about memory leaks [3] and [4] [1] htt