incorrect hits when using multiple threads

2010-03-20 Thread Ruben Laguna
Hi, I'm getting incorrect results from IndexSearcher, hopefully somebody can give me a hand. I have a single IndexWriter instance shared by several threads that invoke addDocument on the IW. I also have another thread that invokes commit() periodically (every 10s). Then I have another thread that

Re: incorrect hits when using multiple threads

2010-03-20 Thread Ruben Laguna
> On Sat, Mar 20, 2010 at 11:52 AM, Ruben Laguna wrote: > > Hi, > > I'm getting incorrect results from IndexSearcher, hopefully somebody can > > give me a hand. > > > > I have a single IndexWriter instance shared by several threads that > invoke >

IndexWriter memory leak?

2010-04-07 Thread Ruben Laguna
Hi, It seems like my IndexWriter, after committing and optimizing, has a retained size of 140 MB. See [1] for a screenshot of the heap dump analysis done with Eclipse MAT. Of those 140 MB, 67 MB are retained by analyzer.tokenStreams.hardRefs.table.HashMap$Entry.value.tokenStream.scanner.zzBuffer why is

Re: IndexWriter memory leak?

2010-04-07 Thread Ruben Laguna
's Reader in my wrapper when the wrapper is closed. But I still think that it would be nicer if IndexWriter didn't maintain references to the Readers after indexing. [3] http://img.skitch.com/20100407-ntn2kg13fx49wx4q118bp9h1hb.jpg On Wed, Apr 7, 2010 at 10:35 PM, Ruben Laguna wrote: >

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
which can cause the > buffer to grow that much. > > Shai > > On Thu, Apr 8, 2010 at 1:23 AM, Ruben Laguna > wrote: > > > I want to add that I tried this in both 2.9.0 and 3.0.1 and I got the > same > > "leaky" behavior. > > > > See [3] for

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
I was investigating this a little further, and in the JFlex mailing list I found [1]. I don't know much about flex / JFlex, but it seems that this guy resets the zzBuffer to 16384 or less when setting the input for the lexer. Quoted from shef: I set %buffer 0 in the options section, and then ad
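
For readers following the zzBuffer discussion, the idea of shrinking the buffer whenever new input is set can be sketched roughly as below. This is only an illustration under stated assumptions, not the JFlex-generated code or the eventual Lucene fix: the class `ShrinkingScanner` and its methods `reset`, `fill` and `bufferCapacity` are hypothetical names, and `fill` simply buffers the whole input, whereas a real scanner grows its buffer under different conditions. Only the field name `zzBuffer` and the 16384 limit come from the post.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a generated scanner; only the buffer handling is modeled.
final class ShrinkingScanner {
    // The 16384 limit mentioned in the post.
    static final int INITIAL_BUFFER_SIZE = 16384;

    private char[] zzBuffer = new char[INITIAL_BUFFER_SIZE];
    private int length;      // number of valid chars currently in zzBuffer
    private Reader zzReader;

    /** Sets new input and drops an oversized buffer so it can be garbage collected. */
    void reset(Reader reader) {
        zzReader = reader;
        length = 0;
        if (zzBuffer.length > INITIAL_BUFFER_SIZE) {
            zzBuffer = new char[INITIAL_BUFFER_SIZE];
        }
    }

    /** Simplification: reads the entire input, doubling the buffer whenever it is full. */
    void fill() {
        try {
            while (true) {
                if (length == zzBuffer.length) {
                    char[] bigger = new char[zzBuffer.length * 2];
                    System.arraycopy(zzBuffer, 0, bigger, 0, length);
                    zzBuffer = bigger;
                }
                int n = zzReader.read(zzBuffer, length, zzBuffer.length - length);
                if (n < 0) {
                    return; // end of input
                }
                length += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int bufferCapacity() {
        return zzBuffer.length;
    }

    public static void main(String[] args) {
        ShrinkingScanner scanner = new ShrinkingScanner();
        scanner.reset(new StringReader(new String(new char[100_000])));
        scanner.fill();
        System.out.println("capacity after large input: " + scanner.bufferCapacity());
        scanner.reset(new StringReader("small"));
        System.out.println("capacity after reset: " + scanner.bufferCapacity());
    }
}
```

Without the size check in `reset`, one very large document would leave every later use of the scanner holding on to the grown array, which matches the retained zzBuffer seen in the heap dump.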

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
l are rewritten. > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Ruben Laguna [mailto:ruben.lag...@gmail.com] > > Sent: Thursday, April 08, 2010 11:

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
t; > > received (0 !), I doubt JFlex would consider it a problem. But we can > > > do > > > some small service to our users base by protecting against such > > > problems. > > > > > > And while you're opening the issue, if you want to take a st

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
And by the way, when is Lucene 3.1 coming? On Thu, Apr 8, 2010 at 1:27 PM, Ruben Laguna wrote: > Now that the zzBuffer issue is solved... > > what about the references to the Readers held by docWriter. Tika's > ParsingReaders are quite heavyweight so retaining those in memory >

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
llee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Ruben Laguna [mailto:ruben.lag...@gmail.com] > > Sent: Thursday, April 08, 2010 1:28 PM > > To: java-user@lucene.apache.org > > Subject: Re: In

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
Yeah, I checked again and IndexWriter is holding references to the Reader, I'm afraid. I opened bug report https://issues.apache.org/jira/browse/LUCENE-2387 to track this down. On Thu, Apr 8, 2010 at 2:50 PM, Ruben Laguna wrote: > I will double check in the afternoon the heapdump.hpro

Re: IndexWriter memory leak?

2010-04-08 Thread Ruben Laguna
//img.skitch.com/20100407-b86irkp7e4uif2wq1dd4t899qb.jpg > > > > On Thu, Apr 8, 2010 at 2:16 PM, Uwe Schindler wrote: > > > > > Readers are not held. If you indexed the document and gced the > > document > > > instance the readers are gone. > > > > > >

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-04-09 Thread Ruben Laguna
Take a memory snapshot with JConsole -> dumpHeap [1] and then analyze it with Eclipse MAT [2]. Find the biggest objects and look at their path to GC roots to see if Lucene is actually retaining them. You may also want to look at two recently closed bug reports about memory leaks [3] and [4] [1] htt
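
If attaching JConsole is inconvenient, the same kind of snapshot can probably be taken from inside the application. The sketch below assumes a HotSpot-based JVM and uses `com.sun.management.HotSpotDiagnosticMXBean`, whose `dumpHeap` operation is the one JConsole exposes. The class name `HeapDumper` and the default file name `heapdump.hprof` are made up for illustration. The target file must not already exist, and newer JDKs expect the `.hprof` extension.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; assumes the JVM provides the HotSpot diagnostic MXBean.
final class HeapDumper {

    /**
     * Writes a heap dump of the current JVM to the given file.
     * With liveObjectsOnly set, only reachable objects are dumped.
     */
    static void dumpHeap(Path target, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toAbsolutePath().toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default file name is an arbitrary choice for this example.
        Path target = Paths.get(args.length > 0 ? args[0] : "heapdump.hprof");
        dumpHeap(target, true);
        System.out.println("Heap dump written to " + target.toAbsolutePath());
    }
}
```

The resulting file can then be opened in Eclipse MAT in the same way as a dump taken through JConsole.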