Re: OutOfMemoryError on small search in large, simple index

2008-01-25 Thread jm
I am very interested indeed. Do I understand correctly that the tweak you made reduces memory use when searching if you have many docs in the index? I am omitting norms too. If that is the case, can someone point me to the required change? I understand from Yonik's comm

Re: OutOfMemoryError on small search in large, simple index

2008-01-08 Thread Lars Clausen
On Mon, 2008-01-07 at 14:20 -0800, Otis Gospodnetic wrote:
> Please post your results, Lars!
Tried the patch, and it failed to compile (plain Lucene compiled fine). In the process, I looked at TermQuery and found that it'd be easier to copy that code and just hardcode 1.0f for all norms. Did tha

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Yonik Seeley
On Jan 7, 2008 5:00 AM, Lars Clausen <[EMAIL PROTECTED]> wrote:
> Doesn't appear to be the case in our test. We had two fields with
> norms, omitting saved only about 4MB for 50 million entries.
It should be 50MB. If you are measuring with an external tool, then that tool is probably in error.
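The 50MB figure above can be checked with simple arithmetic. The sketch below assumes that Lucene keeps one byte of norm data per document for each field that has norms, which is how I read Yonik's number; the class and method names are hypothetical and are not part of Lucene.

```java
public class NormsMemoryEstimate {

    /**
     * Estimated heap needed for norms, assuming one byte per document
     * per field with norms (an assumption made for this sketch).
     */
    public static long normsBytes(long docCount, int fieldsWithNorms) {
        return docCount * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long docs = 50_000_000L; // index size mentioned in the thread

        // One field with norms: about 50 MB.
        System.out.println("one field:  " + normsBytes(docs, 1) / 1_000_000 + " MB");

        // Two fields with norms, as in Lars's test: about 100 MB.
        System.out.println("two fields: " + normsBytes(docs, 2) / 1_000_000 + " MB");
    }
}
```

Under that assumption, a measured saving of only 4MB suggests that the norms arrays were still being allocated somewhere, or that the measurement missed them.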

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Otis Gospodnetic
Please post your results, Lars!
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Lars Clausen <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, January 7, 2008 5:00:54 AM
Subject: Re: OutOfMemoryError on small sea

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Lars Clausen
On Tue, 2008-01-01 at 23:38 -0800, Chris Hostetter wrote:
> : On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
>
> : Seems there's a reason we still use all this memory:
> : SegmentReader.fakeNorms() creates the full-size array for us anyway, so
> : the memory usage cannot be avoided as lon

Re: OutOfMemoryError on small search in large, simple index

2008-01-01 Thread Chris Hostetter
: On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
: Seems there's a reason we still use all this memory:
: SegmentReader.fakeNorms() creates the full-size array for us anyway, so
: the memory usage cannot be avoided as long as somebody asks for the
: norms array at any point.
The solution

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
> I've now made trial runs with no norms on the two indexed fields, and
> also tried with varying TermIndexIntervals. Omitting the norms saves
> about 4MB on 50 million entries, much less than I expected.
Seems there's a reason we still use

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote:
> Increasing
> the TermIndexInterval by a factor of 4 gave no measurable savings.
Following up on myself because I'm not 100% sure that the indexes have the term index intervals I expect, and I'd like to check. Where can I see what term ind

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Tue, 2007-11-13 at 07:26 -0800, Chris Hostetter wrote:
> : > Can it be right that memory usage depends on size of the index rather
> : > than size of the result?
> :
> : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
> : the JVM now?
>
> and in general: yes. Luc

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Chris Hostetter
: > Can it be right that memory usage depends on size of the index rather
: > than size of the result?
:
: Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to
: the JVM now?
and in general: yes. Lucene is using memory so that *lots* of searches can be fast ... if you r

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Daniel Naber
On Tuesday, 13 November 2007, Lars Clausen wrote:
> Can it be right that memory usage depends on size of the index rather
> than size of the result?
Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to the JVM now?
Regards
Daniel
--
http://www.danielnaber.de
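To illustrate why the term index interval ties memory use to index size, here is a rough sketch. It assumes that a searcher keeps every Nth term of the term dictionary in memory, where N is the term index interval (128 by default in Lucene of this era). The total term count is a hypothetical figure chosen for the example, and the class and method names are not Lucene APIs.

```java
public class TermIndexEstimate {

    /**
     * Number of term dictionary entries held in memory, assuming every
     * termIndexInterval-th term is loaded (rounded up).
     */
    public static long termsHeldInMemory(long totalTerms, int termIndexInterval) {
        if (termIndexInterval < 1) {
            throw new IllegalArgumentException("termIndexInterval must be >= 1");
        }
        return (totalTerms + termIndexInterval - 1) / termIndexInterval;
    }

    public static void main(String[] args) {
        long totalTerms = 100_000_000L; // hypothetical number of unique terms

        // Default interval versus an interval four times larger.
        System.out.println("interval 128: " + termsHeldInMemory(totalTerms, 128) + " terms in memory");
        System.out.println("interval 512: " + termsHeldInMemory(totalTerms, 512) + " terms in memory");
    }
}
```

The interval is set on the writer through IndexWriter.setTermIndexInterval(), so it presumably only affects segments written after the change. An existing index would likely need to be rebuilt or merged before any saving shows up.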

OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Lars Clausen
We've run into a blocking problem with our use of Lucene: we get OutOfMemoryError when performing a one-term search in our index. The search, if completed, should give only a few thousand hits, but from inspecting a heap dump it appears that many more documents in the index get stored in Lucene dur