Re: OutOfMemoryError indexing large documents

2014-11-26 Thread ryanb
100MB of text for a single Lucene document, into a single analyzed field. The analyzer is basically the StandardAnalyzer, with minor changes: 1. UAX29URLEmailTokenizer instead of the StandardTokenizer. This doesn't split URLs and email addresses (so we can do it ourselves in the next step). 2. Spli
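For reference, a rough sketch of an analyzer along the lines described above, covering only the tokenizer swap in point 1; the class name, match-version plumbing, and exact filter chain are assumptions, not the poster's actual code (Lucene 4.x APIs):

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer;
import org.apache.lucene.util.Version;

// StandardAnalyzer-like chain, but with UAX29URLEmailTokenizer so URLs and
// e-mail addresses are kept as single tokens instead of being split.
public class UrlEmailAnalyzer extends Analyzer {
  private final Version matchVersion;

  public UrlEmailAnalyzer(Version matchVersion) {
    this.matchVersion = matchVersion;
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer source = new UAX29URLEmailTokenizer(matchVersion, reader);
    TokenStream sink = new StandardFilter(matchVersion, source);
    sink = new LowerCaseFilter(matchVersion, sink);
    sink = new StopFilter(matchVersion, sink, StandardAnalyzer.STOP_WORDS_SET);
    return new TokenStreamComponents(source, sink);
  }
}
```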

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread ryanb
I've had success limiting the number of documents by size, and doing them 1 at a time works OK with 2G heap. I'm also hoping to understand why memory usage would be so high to begin with, or maybe this is expected? I agree that indexing 100+M of text is a bit silly, but the use case is a legal con
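A hypothetical sketch of the size-based batching described above; the threshold, helper names, and commit points are made up, the idea is simply that oversized documents get indexed on their own rather than alongside 19 others:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

// Send small documents in batches, but anything over the size cap goes alone,
// so the indexing buffers never hold several huge documents at once.
public class SizeAwareIndexer {
  private static final long MAX_BATCH_CHARS = 20L * 1024 * 1024; // illustrative cap

  public static void indexAll(IndexWriter writer, List<Document> docs,
                              List<Long> textSizes) throws IOException {
    List<Document> batch = new ArrayList<Document>();
    long batchChars = 0;
    for (int i = 0; i < docs.size(); i++) {
      long size = textSizes.get(i);
      if (size >= MAX_BATCH_CHARS) {
        flush(writer, batch);             // clear out anything pending first
        batchChars = 0;
        writer.addDocument(docs.get(i));  // huge document is indexed by itself
        writer.commit();
        continue;
      }
      batch.add(docs.get(i));
      batchChars += size;
      if (batchChars >= MAX_BATCH_CHARS) {
        flush(writer, batch);
        batchChars = 0;
      }
    }
    flush(writer, batch);
  }

  private static void flush(IndexWriter writer, List<Document> batch) throws IOException {
    for (Document d : batch) {
      writer.addDocument(d);
    }
    batch.clear();
  }
}
```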

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread Trejkaz
On Wed, Nov 26, 2014 at 2:09 PM, Erick Erickson wrote: > Well > 2> seriously consider the utility of indexing a 100+M file. Assuming > it's mostly text, lots and lots and lots of queries will match it, and > it'll score pretty low due to length normalization. And you probably > can't return it to

Re: OutOfMemoryError indexing large documents

2014-11-26 Thread Jack Krupansky
Is that 100MB for a single Lucene document? And is that 100MB for a single field? Is that field analyzed text? How complex is the analyzer? Like, does it do ngrams or something else that is token or memory intensive? Posting the analyzer might help us see what the issue might be. Try indexing

Re: OutOfMemoryError indexing large documents

2014-11-25 Thread Erick Erickson
Well 1> don't send 20 docs at once. Or send docs over some size N by themselves. 2> seriously consider the utility of indexing a 100+M file. Assuming it's mostly text, lots and lots and lots of queries will match it, and it'll score pretty low due to length normalization. And you probably can't re

Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
To disable norms, do we need to rebuild our index entirely? By the way, we have 8 million documents and our JVM heap is 5G. Thanks & Best Regards! -- Original -- From: "Michael McCandless"

Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer

2014-09-13 Thread Michael McCandless
We have 8 million documents and our JVM heap is 5G. Thanks & Best Regards! -- Original -- From: "Michael McCandless" Date: Sat, Sep 13, 2014 06:29 PM To: "Lucene Users"

Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
To disable norms, do we need to rebuild our index entirely? By the way, we have 8 million documents and our JVM heap is 5G. Thanks & Best Regards! -- Original -- From: "Michael McCandless" Date: Sat, Sep 13, 2014 06:29 PM To: "Lucene Us

Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer

2014-09-13 Thread 308181687
documents and our JVM heap is 5G. Thanks & Best Regards! -- Original -- From: "Michael McCandless" Date: Sat, Sep 13, 2014 06:29 PM To: "Lucene Users" Subject: Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer The w

Re: OutOfMemoryError thrown by SimpleMergedSegmentWarmer

2014-09-13 Thread Michael McCandless
The warmer just tries to load norms/docValues/etc. for all fields that have them enabled ... so this is likely telling you an IndexReader would also hit OOME. You either need to reduce the number of fields you have indexed, or at least disable norms (takes 1 byte per doc per indexed field regardle
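A minimal sketch of indexing a field with norms disabled, assuming the Lucene 4.x FieldType API; the field name and helper are illustrative:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;

public class NoNormsFieldExample {
  public static Document build(String text) {
    // Analyzed, unstored text field with norms turned off: norms cost about
    // one byte per document per indexed field in every open reader.
    FieldType analyzedNoNorms = new FieldType(TextField.TYPE_NOT_STORED);
    analyzedNoNorms.setOmitNorms(true);
    analyzedNoNorms.freeze();

    Document doc = new Document();
    doc.add(new Field("body", text, analyzedNoNorms));
    return doc;
  }
}
```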

Re: OutOfMemoryError when opening the index?

2012-06-13 Thread Yang
OK, found it: we are using Cloudera CDHu3u; they change the ulimit for child jobs. But I still don't know how to change their default settings yet. On Wed, Jun 13, 2012 at 2:15 PM, Yang wrote: > I got the OutOfMemoryError when I tried to open a Lucene index. > It's very weird since this is o

Re: OutOfMemoryError

2011-10-19 Thread Tamara Bobic
, Tamara - Original Message - > From: "Otis Gospodnetic" > To: java-user@lucene.apache.org > Sent: Tuesday, October 18, 2011 11:14:12 PM > Subject: Re: OutOfMemoryError > > Bok Tamara, > > You didn't say what -Xmx value you are using.  Try a little higher &

RE: OutOfMemoryError

2011-10-18 Thread Uwe Schindler
Hi, > ...I get around 3 million hits. Each of the hits is processed and information from a certain field is used. That's of course fine, but: > After a certain number of hits, somewhere around 1 million (not always the same number), I get an OutOfMemory exception that looks like this: You did

Re: OutOfMemoryError

2011-10-18 Thread Mead Lai
Tamara, You may use StringBuffer instead of String: docText = hits.doc(j).getField("DOCUMENT").stringValue(); after that you may use StringBuffer.delete() to release memory. Another way is using a 64-bit machine. Regards, Mead On Wed, Oct 19, 2011 at 5:14 AM, Otis Gospodnetic <otis_gospodne...@

Re: OutOfMemoryError

2011-10-18 Thread Otis Gospodnetic
Bok Tamara, You didn't say what -Xmx value you are using. Try a little higher value. Note that loading field values (and it looks like this one may be big because it is compressed) from a lot of hits is not recommended. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene e

Re: OutOfMemoryError with FSDirectory

2011-04-05 Thread Michael McCandless
Try 1) reducing the RAM buffer of your IndexWriter (IndexWriter.setRAMBufferSizeMB), 2) using a term divisor when opening your reader (pass 2 or 3 or 4 as termInfosIndexDivisor when opening IndexReader), and 3) disabling norms or not indexing as many fields as possible. 70Mb is not that much RAM t
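A rough sketch of suggestions 1 and 2 against the 3.x-era APIs mentioned above; the buffer size, divisor value, and class name are illustrative:

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class LowMemoryExample {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File(args[0]));

    // 1) Shrink IndexWriter's RAM buffer so it flushes to disk sooner.
    IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
        IndexWriter.MaxFieldLength.UNLIMITED);
    writer.setRAMBufferSizeMB(16.0);
    writer.close();

    // 2) Open the reader with a term-index divisor: only every 3rd indexed term
    //    is kept in RAM, at the cost of slightly slower term lookups.
    IndexReader reader = IndexReader.open(dir, null /* deletion policy */,
        true /* read-only */, 3 /* termInfosIndexDivisor */);
    System.out.println("maxDoc=" + reader.maxDoc());
    reader.close();
  }
}
```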

Re: OutOfMemoryError with FSDirectory

2011-04-04 Thread Claudio
Ok Erick, Thanks for your quick answer. FSDirectory will, indeed, store the index on disk. However, when *using* that index, lots of stuff happens. Specifically: When indexing, there is a buffer that accumulates documents until it's flushed to disk. Are you indexing? When searching (and this

Re: OutOfMemoryError with FSDirectory

2011-04-04 Thread Erick Erickson
FSDirectory will, indeed, store the index on disk. However, when *using* that index, lots of stuff happens. Specifically: When indexing, there is a buffer that accumulates documents until it's flushed to disk. Are you indexing? When searching (and this is the more important part), various caches a

Re: OutOfMemoryError

2010-03-06 Thread Monique Monteiro
Hi Otis, no, I don't use sort. But I use TopFieldCollector and I have to instantiate a Sort object with new Sort(). The data are returned unsorted. On Fri, Mar 5, 2010 at 7:38 PM, Otis Gospodnetic wrote: > Maybe it's not a leak, Monique. :) > If you use sorting in Lucene, then the FieldCache

Re: OutOfMemoryError

2010-03-05 Thread Otis Gospodnetic
Maybe it's not a leak, Monique. :) If you use sorting in Lucene, then the FieldCache object will keep some data permanently in memory, for example. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message -

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Nuno Seco
ww.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Nuno Seco [mailto:ns...@dei.uc.pt] Sent: Thursday, November 12, 2009 6:08 PM To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError when using Sort Ok. Thanks. The doc. says: "Finds the top |n| hits for |que

RE: OutOfMemoryError when using Sort

2009-11-12 Thread Uwe Schindler
should be enough for 5-grams). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Nuno Seco [mailto:ns...@dei.uc.pt] > Sent: Thursday, November 12, 2009 6:08 PM > To: java-user@lucene.apache.org &

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Jake Mannix
It is only sorting the top 50 hits, yes, but to do that, it needs to look at the *value* of the field for each and every one of the billions of documents. You can do this without using memory if you're willing to deal with disk seeks, but doing billions of those is going to mean that this query most

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Nuno Seco
OK. Thanks. The doc. says: "Finds the top |n| hits for |query|, applying |filter| if non-null, and sorting the hits by the criteria in |sort|." I understood that only the hits (50 in this case) for the current search would be sorted... I'll just do the ordering afterwards. Thank you for clarifyin

Re: OutOfMemoryError when using Sort

2009-11-12 Thread Jake Mannix
Sorting utilizes a FieldCache: the forward lookup - the value a document has for a particular field (as opposed to the usual "inverted" way of looking at all documents which contain a given term) - which lives in memory and takes up as much space as 4 bytes * numDocs. If you've indexed the en
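To put an illustrative number on that 4 bytes * numDocs figure (the document count here is assumed, not taken from the thread):

```java
public class FieldCacheFootprint {
  public static void main(String[] args) {
    long numDocs = 2000000000L;             // "billions of documents", illustrative
    long bytesPerEntry = 4L;                // one int-sized slot per document
    long bytes = numDocs * bytesPerEntry;   // 8,000,000,000 bytes
    System.out.println(bytes / (1024.0 * 1024 * 1024) + " GB for one sort field");
    // ~7.45 GB of heap, resident for the reader's lifetime, for a single sorted field.
  }
}
```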

RE: OutOfMemoryError when using Sort

2009-11-12 Thread Uwe Schindler
To sort on the count, the field must be indexed (but not tokenized); it does not need to be stored. But in any case, sort needs lots of memory. How many documents do you have? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original M
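A minimal sketch of such a sort field with the 2.9/3.x Field API; the field name and helper are illustrative:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class SortFieldExample {
  public static Document build(String countAsString) {
    Document doc = new Document();
    // Indexed as a single untokenized term, no norms, and not stored:
    // that is all a sort field needs.
    doc.add(new Field("count", countAsString,
        Field.Store.NO, Field.Index.NOT_ANALYZED_NO_NORMS));
    return doc;
  }
}
```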

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Thu 25.06.2009 13:13 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError using IndexWriter > > Can you post your test code? If you can make it a standalone test, > then I can repro and dig down faster.

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
OK it looks like no merging was done. I think the next step is to call IndexWriter.setMaxBufferedDeleteTerms(1000) and see if that prevents the OOM. Mike On Thu, Jun 25, 2009 at 7:16 AM, stefan wrote: > Hi, > > Here are the result of CheckIndex. I ran this just after I got the OOError. > > OK [4
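A sketch of that next step against the 2.4-era IndexWriter API, using the suggested threshold of 1000; the directory path and analyzer are assumptions:

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class BufferedDeleteTermsExample {
  public static void main(String[] args) throws Exception {
    IndexWriter writer = new IndexWriter(FSDirectory.getDirectory(new File(args[0])),
        new StandardAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED);
    // Flush buffered delete terms once 1000 have accumulated instead of
    // holding them all in RAM until the next commit.
    writer.setMaxBufferedDeleteTerms(1000);
    writer.close();
  }
}
```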

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Simon Willnauer
rays, I will need some >> more time for this. >> Stefan >> -Original Message- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Wed 24.06.2009 17:50 >> To: java-user@lucene.apache.org >> Subject: Re: OutO

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
time for this. > Stefan > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wed 24.06.2009 17:50 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError using IndexWriter > On Wed, Jun 24, 2009 at 10:1

Re: OutOfMemoryError using IndexWriter

2009-06-25 Thread Michael McCandless
On Thu, Jun 25, 2009 at 3:02 AM, stefan wrote: >>But a "leak" would keep leaking over time, right?  Ie even a 1 GB heap >>on your test db should eventually throw OOME if there's really a leak. > No not necessarily, since I stop indexing ones everything is indexed - I > shall try repeated runs wit

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 10:23 AM, stefan wrote: > does Lucene keep the complete index in memory ? No. Certain things (deleted docs, norms, field cache, terms index) are loaded into memory, but these are tiny compared to what's not loaded into memory (postings, stored docs, term vectors). > As s

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 10:18 AM, stefan wrote: > > Hi, > > >>OK so this means it's not a leak, and instead it's just that stuff is >>consuming more RAM than expected. > Or that my test db is smaller than the production db which is indeed the case. But a "leak" would keep leaking over time, right?

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
: Sudarsan, Sithu D. [mailto:sithu.sudar...@fda.hhs.gov] Sent: Wed 24.06.2009 16:18 To: java-user@lucene.apache.org Subject: RE: OutOfMemoryError using IndexWriter When the segments are merged, but not optimized. It happened at 1.8GB to our program, and now we develop and test in Win32 but run the

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
apache.org Subject: RE: OutOfMemoryError using IndexWriter Hi Stefan, Are you using Windows 32 bit? If so, sometimes, if the index file before optimizations crosses your JVM memory usage settings (say 512MB), there is a possibility of this happening. Increase JVM memory settings if that i

RE: OutOfMemoryError using IndexWriter

2009-06-24 Thread Sudarsan, Sithu D.
Hi Stefan, Are you using Windows 32 bit? If so, sometimes, if the index file before optimizations crosses your JVM memory usage settings (say 512MB), there is a possibility of this happening. Increase JVM memory settings if that is the case. Sincerely, Sithu D Sudarsan Off: 301-796-2587

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
On Wed, Jun 24, 2009 at 7:43 AM, stefan wrote: > I tried with 100MB heap size and got the Error as well, it runs fine with > 120MB. OK so this means it's not a leak, and instead it's just that stuff is consuming more RAM than expected. > Here is the histogram (application classes marked with --

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Otis Gospodnetic
Hi Stefan, While not directly the source of your problem, I have a feeling you are optimizing too frequently (and wasting time/CPU by doing so). Is there a reason you optimize so often? Try optimizing only at the end, when you know you won't be adding any more documents to the index for a whi

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
)      3268608 (size) > > Well, something I should do differently? > > Stefan > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wed 24.06.2009 10:48 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemory

Re: OutOfMemoryError using IndexWriter

2009-06-24 Thread Michael McCandless
How large is the RAM buffer that you're giving IndexWriter? How large a heap size do you give to JVM? Can you post one of the OOM exceptions you're hitting? Mike On Wed, Jun 24, 2009 at 4:08 AM, stefan wrote: > Hi, > > I am using Lucene 2.4.1 to index a database with less than a million records

Re: OutOfMemoryError on small search in large, simple index

2008-01-25 Thread jm
I am very interested indeed. Do I understand correctly that the tweak you made reduces the memory used when searching if you have many docs in the index? I am omitting norms too. If that is the case, can someone point me to the required change that should be done? I understand from Yonik's comm

Re: OutOfMemoryError on small search in large, simple index

2008-01-08 Thread Lars Clausen
On Mon, 2008-01-07 at 14:20 -0800, Otis Gospodnetic wrote: > Please post your results, Lars! Tried the patch, and it failed to compile (plain Lucene compiled fine). In the process, I looked at TermQuery and found that it'd be easier to copy that code and just hardcode 1.0f for all norms. Did tha

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Yonik Seeley
On Jan 7, 2008 5:00 AM, Lars Clausen <[EMAIL PROTECTED]> wrote: > Doesn't appear to be the case in our test. We had two fields with > norms, omitting saved only about 4MB for 50 million entries. It should be 50MB. If you are measuring with an external tool, then that tool is probably in error.

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Otis Gospodnetic
Please post your results, Lars! Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Lars Clausen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, January 7, 2008 5:00:54 AM Subject: Re: OutOfMemoryError on small sea

Re: OutOfMemoryError on small search in large, simple index

2008-01-07 Thread Lars Clausen
On Tue, 2008-01-01 at 23:38 -0800, Chris Hostetter wrote: > : On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > > : Seems there's a reason we still use all this memory: > : SegmentReader.fakeNorms() creates the full-size array for us anyway, so > : the memory usage cannot be avoided as lon

Re: OutOfMemoryError on small search in large, simple index

2008-01-01 Thread Chris Hostetter
: On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: : Seems there's a reason we still use all this memory: : SegmentReader.fakeNorms() creates the full-size array for us anyway, so : the memory usage cannot be avoided as long as somebody asks for the : norms array at any point. The solution

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > I've now made trial runs with no norms on the two indexed fields, and > also tried with varying TermIndexIntervals. Omitting the norms saves > about 4MB on 50 million entries, much less than I expected. Seems there's a reason we still use

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: > Increasing > the TermIndexInterval by a factor of 4 gave no measurable savings. Following up on myself because I'm not 100% sure that the indexes have the term index intervals I expect, and I'd like to check. Where can I see what term ind

Re: OutOfMemoryError on small search in large, simple index

2007-12-12 Thread Lars Clausen
On Tue, 2007-11-13 at 07:26 -0800, Chris Hostetter wrote: > : > Can it be right that memory usage depends on size of the index rather > : > than size of the result? > : > : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to > : the JVM now? > > and in general: yes. Luc

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Chris Hostetter
: > Can it be right that memory usage depends on size of the index rather : > than size of the result? : : Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to : the JVM now? and in general: yes. Lucene is using memory so that *lots* of searches can be fast ... if you r

Re: OutOfMemoryError on small search in large, simple index

2007-11-13 Thread Daniel Naber
On Tuesday, 13 November 2007, Lars Clausen wrote: > Can it be right that memory usage depends on size of the index rather > than size of the result? Yes, see IndexWriter.setTermIndexInterval(). How much RAM are you giving to the JVM now? Regards, Daniel -- http://www.danielnaber.de ---
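A minimal sketch of adjusting that setting at index-build time with the 2.x API; the interval value of 512 is illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TermIndexIntervalExample {
  public static void main(String[] args) throws Exception {
    IndexWriter writer = new IndexWriter(args[0], new StandardAnalyzer(), true);
    // Default is 128: readers keep every 128th term in RAM.
    // Quadrupling it cuts the in-memory term index to roughly a quarter,
    // at the cost of slower term lookups.
    writer.setTermIndexInterval(512);
    writer.close();
  }
}
```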

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Karl Wettin
I believe the problem is that the text value is not the only data associated with a token; there is, for instance, the position offset. Depending on your JVM, each instance reference consumes 64 bits or so, so even if the text value is flyweighted by String.intern() there is a cost. I doubt tha

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Askar Zaidi
I have indexed around 100 M of data with 512M to the JVM heap. So that gives you an idea. If every token is the same word in one file, shouldn't the tokenizer recognize that? Try using Luke. That helps solve lots of issues. - AZ On 9/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote: > > I can't

Re: OutOfMemoryError tokenizing a boring text file

2007-09-01 Thread Erick Erickson
I can't answer the question of why the same token takes up memory, but I've indexed far more than 20M of data in a single document field. As in on the order of 150M. Of course I allocated 1G or so to the JVM, so you might try that. Best, Erick On 8/31/07, Per Lindberg <[EMAIL PROTECTED]> wrote:

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
Thanks for your quick reply. I will go through it. Regards, Jelda > -Original Message- > From: mark harwood [mailto:[EMAIL PROTECTED] > Sent: Tuesday, May 02, 2006 5:03 PM > To: java-user@lucene.apache.org > Subject: RE: OutOfMemoryError while enumerating through > read

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread mark harwood
"Category counts" should really be a FAQ entry. There is no one right solution to prescribe because it depends on the shape of your data. For previous discussions/code samples see here: http://www.mail-archive.com/java-user@lucene.apache.org/msg05123.html and here for more space-efficient repre

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
mailto:[EMAIL PROTECTED] > Sent: Tuesday, May 02, 2006 4:41 PM > To: java-user@lucene.apache.org > Subject: RE: OutOfMemoryError while enumerating through > reader.terms(fieldName) > > I am trying to implement category count almost similar to > CNET approach. > At the initia

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
PM > To: java-user@lucene.apache.org > Subject: RE: OutOfMemoryError while enumerating through > reader.terms(fieldName) > > >>Any advise is relly welcome. > > Don't cache all that data. > You need a minimum of (numUniqueTerms*numDocs)/8 bytes to > hold that info. >

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread mark harwood
>>Any advice is really welcome. Don't cache all that data. You need a minimum of (numUniqueTerms*numDocs)/8 bytes to hold that info. Assuming 10,000 unique terms and 1 million docs you'd need over 1 Gig of RAM. I suppose the question is what are you trying to achieve and why can't you use the exis
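The arithmetic behind that estimate, spelled out with the numbers from the message above:

```java
public class BitSetCacheFootprint {
  public static void main(String[] args) {
    long numUniqueTerms = 10000L;
    long numDocs = 1000000L;
    // One cached BitSet per term, one bit per document.
    long bytes = numUniqueTerms * numDocs / 8;   // 1,250,000,000 bytes
    System.out.println(bytes / (1024.0 * 1024 * 1024) + " GB of cached bits");
    // ~1.16 GB, i.e. "over 1 Gig of RAM" as noted above.
  }
}
```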

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

2006-05-02 Thread Ramana Jelda
Hi, I just debugged it closely. Sorry, I am getting OutOfMemoryError not because of reader.terms() but because of invoking the QueryFilter.bits() method for each unique term. I will try to explain with pseudo code: while(term != null){ if(term.field().equals(name)){ String termText
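If the counts are computed over the whole index at startup (as the thread suggests), one lighter-weight alternative is to read document frequencies straight from the term dictionary instead of building a BitSet per term; a sketch against the 1.9/2.x API, with an illustrative field name:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

public class CategoryCounts {
  public static void print(IndexReader reader, String field) throws Exception {
    TermEnum terms = reader.terms(new Term(field, ""));
    try {
      do {
        Term t = terms.term();
        if (t == null || !t.field().equals(field)) {
          break;                      // walked past the last term of this field
        }
        // docFreq comes from the term dictionary; no BitSet is materialized.
        // Note: docFreq also counts deleted documents.
        System.out.println(t.text() + " -> " + reader.docFreq(t));
      } while (terms.next());
    } finally {
      terms.close();
    }
  }
}
```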

Re: OutOfMemoryError on addIndexes()

2005-08-19 Thread Tony Schwartz
Aha, it's not initially clear, but after looking at it more closely, I see how it works now. This is very good to know. Tony Schwartz [EMAIL PROTECTED] > Tony Schwartz wrote: >> What about the TermInfosReader class? It appears to read the entire term >> set for the >> segment into 3 arrays

Re: OutOfMemoryError on addIndexes()

2005-08-18 Thread Doug Cutting
Tony Schwartz wrote: What about the TermInfosReader class? It appears to read the entire term set for the segment into 3 arrays. Am I seeing double on this one? p.s. I am looking at the current sources. see TermInfosReader.ensureIndexIsRead(); The index only has 1/128 of the terms, by def

Re: OutOfMemoryError on addIndexes()

2005-08-18 Thread Doug Cutting
Tony Schwartz wrote: I think you're jumping into the conversation too late. What you have said here does not address the problem at hand. That is, in TermInfosReader, all terms in the segment get loaded into three very large arrays. That's not true. Only 1/128th of the terms are loaded by
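For scale, with an assumed term count:

```java
public class TermIndexRamEstimate {
  public static void main(String[] args) {
    long uniqueTerms = 100000000L;    // 100M unique terms, illustrative
    long indexInterval = 128;         // the default discussed above
    long indexedTerms = uniqueTerms / indexInterval;
    System.out.println(indexedTerms + " terms held in RAM by TermInfosReader");
    // 781,250 index terms, not 100,000,000 - only every 128th term is loaded.
  }
}
```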

Re: OutOfMemoryError on addIndexes()

2005-08-18 Thread Paul Elschot
On Thursday 18 August 2005 14:32, Tony Schwartz wrote: > Is this a viable solution? > Doesn't this make sorting and filtering much more complex and much more > expensive as well? Sorting would have to be done on more than one field. I would expect that to be possible. As for filtering: would you

RE: OutOfMemoryError on addIndexes()

2005-08-18 Thread Tony Schwartz
ww.aviransplace.com > > -Original Message- > From: Tony Schwartz [mailto:[EMAIL PROTECTED] > Sent: Thursday, August 18, 2005 8:32 AM > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError on addIndexes() > > Is this a viable solution? > Doesn't this m

RE: OutOfMemoryError on addIndexes()

2005-08-18 Thread Mordo, Aviran (EXP N-NANNATEK)
-user@lucene.apache.org Subject: Re: OutOfMemoryError on addIndexes() Is this a viable solution? Doesn't this make sorting and filtering much more complex and much more expensive as well? Tony Schwartz [EMAIL PROTECTED] > On Wednesday 17 August 2005 22:49, Paul Elschot wrote: >> > the i

Re: OutOfMemoryError on addIndexes()

2005-08-18 Thread Tony Schwartz
Is this a viable solution? Doesn't this make sorting and filtering much more complex and much more expensive as well? Tony Schwartz [EMAIL PROTECTED] > On Wednesday 17 August 2005 22:49, Paul Elschot wrote: >> > the index could potentially be huge. >> > >> > So if this is indeed the case, it is

Re: OutOfMemoryError on addIndexes()

2005-08-17 Thread Paul Elschot
On Wednesday 17 August 2005 22:49, Paul Elschot wrote: > > the index could potentially be huge. > > > > So if this is indeed the case, it is a potential scalability > > bottleneck in lucene index size. > > Splitting the date field into century, year in century, month, day, hour, > seconds, and >

Re: OutOfMemoryError on addIndexes()

2005-08-17 Thread Paul Elschot
hanks, > > Tony Schwartz > [EMAIL PROTECTED] > > > > > > From: John Wang <[EMAIL PROTECTED]> > Subject: Re: OutOfMemoryError on addIndexes() > > > > > Under many us

Re: OutOfMemoryError on addIndexes()

2005-08-17 Thread Tony Schwartz
urned me in the past. I am going to start working on and testing a solution to this, but was wondering if anyone had already messed with it or had any ideas up front? Thanks, Tony Schwartz [EMAIL PROTECTED] From: John Wang <[EMAIL PROTECTED]> Subject: Re: OutOfMemoryError on

Re: OutOfMemoryError on addIndexes()

2005-08-16 Thread John Wang
the nature of your indexes? > > > : Date: Fri, 12 Aug 2005 09:45:40 +0200 > : From: Trezzi Michael <[EMAIL PROTECTED]> > : Reply-To: java-user@lucene.apache.org > : To: java-user@lucene.apache.org > : Subject: RE: OutOfMemoryError on addIndexes() > : > : I did some m

RE: OutOfMemoryError on addIndexes()

2005-08-12 Thread Chris Hostetter
aps some binary data is mistakenly getting treated as strings? can you tell us more about the nature of your indexes? : Date: Fri, 12 Aug 2005 09:45:40 +0200 : From: Trezzi Michael <[EMAIL PROTECTED]> : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : Subjec

RE: OutOfMemoryError on addIndexes()

2005-08-12 Thread Trezzi Michael
To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError on addIndexes() How much memory are you giving your programs? java -Xmx<size> (set maximum Java heap size) -- Ian. On 10/08/05, Trezzi Michael <[EMAIL PROTECTED]> wrote: > Hello, > I have a problem and I tried everything I could think of to solve it. To understand my situation, I create indexes on several

RE: OutOfMemoryError on addIndexes()

2005-08-11 Thread Otis Gospodnetic
m: Otis Gospodnetic [mailto:[EMAIL PROTECTED] > Sent: Thursday, August 11, 2005 11:15 AM > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError on addIndexes() > > > > Is -Xmx case sensitive? Should it be 1000m instead of 1000M? > Not > > > sure. >

RE: OutOfMemoryError on addIndexes()

2005-08-11 Thread Aigner, Thomas
would shrink the java memory pool back down to the min? Thanks, Tom -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Thursday, August 11, 2005 11:15 AM To: java-user@lucene.apache.org Subject: Re: OutOfMemoryError on addIndexes() > > Is -Xmx case sensitive?

Re: OutOfMemoryError on addIndexes()

2005-08-11 Thread Otis Gospodnetic
> Is -Xmx case sensitive? Should it be 1000m instead of 1000M? Not sure. > I'm starting with: java -Xms256M -Xmx512M -jar Suchmaschine.jar And if you look at the size of your JVM, does it really use all 512 MB? If it does not, maybe you can try this: java -Xms256m -Xmx512m -j

Re: OutOfMemoryError on addIndexes()

2005-08-11 Thread Harald Stowasser
Otis Gospodnetic wrote: > Is -Xmx case sensitive? Should it be 1000m instead of 1000M? Not sure. I'm starting with: java -Xms256M -Xmx512M -jar Suchmaschine.jar -- The analytical machine (the computer) can only execute what we are capable of programming. (Ada Lovelace)

RE: OutOfMemoryError on addIndexes()

2005-08-10 Thread Otis Gospodnetic
. > > Michael > > From: Ian Lea [mailto:[EMAIL PROTECTED] > Sent: Wed 10.8.2005 12:34 > To: java-user@lucene.apache.org > Subject: Re: OutOfMemoryError on addIndexes() > > How much memory are you giving your programs

RE: OutOfMemoryError on addIndexes()

2005-08-10 Thread Trezzi Michael
Subject: Re: OutOfMemoryError on addIndexes() How much memory are you giving your programs? java -Xmx<size> (set maximum Java heap size) -- Ian. On 10/08/05, Trezzi Michael <[EMAIL PROTECTED]> wrote: > Hello, > I have a problem and I tried everything I could think of to solve it. To > understand my situation, I create indexes on several

Re: OutOfMemoryError on addIndexes()

2005-08-10 Thread Ian Lea
How much memory are you giving your programs? java -Xmx<size> (set maximum Java heap size) -- Ian. On 10/08/05, Trezzi Michael <[EMAIL PROTECTED]> wrote: > Hello, > I have a problem and I tried everything I could think of to solve it. To > understand my situation, I create indexes on several

Re: OutOfMemoryError

2005-07-28 Thread Lasse L
Hi, If I replace my lucene wrapper with a dummy one the problem goes away. If I close my index-thread every 30 minutes and start a new thread it also goes away. If I exit the thread on OutOfMemory errors it regains all memory. I do not use static variables. If I did they wouldn't get garbage colle

Re: OutOfMemoryError

2005-07-13 Thread Ian Lea
Might be interesting to know if it crashed on 2 docs if you ran it with a heap size of 512MB. I guess you've already tried with default merge values. Shouldn't need to optimize after every 100 docs. JDK 1.3 is pretty ancient - can you use 1.5? I'd try it with a larger heap size, and then look