Hi,

Does this issue have anything to do with the line:
> TopScoreDocCollector collector = new TopScoreDocCollector(100000);

if we do:

> TopScoreDocCollector collector = new TopScoreDocCollector(2);

instead (only see top two documents), could memory usage be less?

Best regards, Lisheng

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, June 10, 2009 5:40 AM
To: java-user@lucene.apache.org
Subject: Re: Lucene memory usage

This (very large number of unique terms) is a problem for Lucene
currently.

There are some simple improvements we could make to the terms dict
format to not require so much RAM per term in the terms index...
LUCENE-1458 (flexible indexing) has these improvements, but
unfortunately tied in w/ lots of other changes. Maybe we should break
out a separate issue for this... this'd be a great contained
improvement, if anyone out there has "the itch" :)

One simple workaround is to call IndexReader.setTermIndexInterval
immediately after opening the reader; this simply loads fewer terms in
the index, using far less RAM, but at the expense of somewhat slower
searching.

Also: you should peek at your index, eg using Luke, to understand why
you have so many terms. It could be legitimate (indexing a massive
catalog with eg part numbers), or, it could be your document filtering
/ analyzer are accidentally producing garbage terms.

Mike

On Wed, Jun 10, 2009 at 8:23 AM, Benedikt Boss<na...@web.de> wrote:
> Hej hej,
>
> i have a question regarding lucenes memory usage
> when launching a query. When i execute my query
> lucene eats up over 1gig of heap-memory even
> when my result-set is only a single hit. I
> found out that this is due to the "ensureIndexIsRead()"
> method-call in the "TermInfosReader" class, which
> iterates over all Terms found in the index and saves
> them (including all value-strings) in a Term-Array.
> Is it possible to not read all that stuff
> into memory at all?
>
> I'm doing the query like in the following pseudo-code:
> ------------------------------------------------------------------------
>
> TopScoreDocCollector collector = new TopScoreDocCollector(100000);
>
> QueryParser parser = new QueryParser(field, new WhitespaceAnalyzer());
> Directory fsDir = new FSDirectory(indexDir, null);
> IndexSearcher is = new IndexSearcher(fsDir);
>
> Query query = parser.parse(q);
>
> is.search(query, collector);
> ScoreDoc[] hits = collector.topDocs().scoreDocs;
>
> ....... < iterate over hits and print results >
>
>
> Thanks in advance
> Benedikt
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
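
To make the trade-off behind Mike's workaround concrete (loading fewer terms into the in-memory terms index saves RAM at the cost of somewhat slower lookups), here is a minimal, self-contained sketch. It is not Lucene's TermInfosReader: the class SampledTermIndex, its methods, and the in-memory list standing in for the on-disk terms dictionary are all hypothetical names invented for this illustration. It assumes a sorted term list and keeps only every interval-th term "in RAM".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is NOT Lucene's TermInfosReader.
final class SampledTermIndex {
    private final List<String> allTerms;   // stands in for the sorted on-disk terms dictionary
    private final List<String> indexTerms; // every interval-th term, the part held "in RAM"
    private final int interval;
    private int lastScanLength;            // dictionary entries examined by the last lookup

    SampledTermIndex(List<String> sortedTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.allTerms = sortedTerms;
        this.interval = interval;
        this.indexTerms = new ArrayList<>();
        for (int i = 0; i < sortedTerms.size(); i += interval) {
            indexTerms.add(sortedTerms.get(i));
        }
    }

    int indexSize() { return indexTerms.size(); }

    int lastScanLength() { return lastScanLength; }

    /** Returns the position of term in the full dictionary, or -1 if absent. */
    int lookup(String term) {
        lastScanLength = 0;
        int pos = Collections.binarySearch(indexTerms, term);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the block to
        // scan starts at the index entry just before the insertion point.
        int block = pos >= 0 ? pos : -pos - 2;
        if (block < 0) {
            return -1; // term sorts before the first term in the dictionary
        }
        int start = block * interval;
        int end = Math.min(start + interval, allTerms.size());
        for (int i = start; i < end; i++) {
            lastScanLength++;
            int cmp = allTerms.get(i).compareTo(term);
            if (cmp == 0) return i;
            if (cmp > 0) break; // passed the spot where the term would be
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            terms.add(String.format("part-%04d", i)); // already in sorted order
        }
        SampledTermIndex dense = new SampledTermIndex(terms, 1);
        SampledTermIndex sparse = new SampledTermIndex(terms, 128);
        System.out.println("index entries: " + dense.indexSize() + " vs " + sparse.indexSize());
        dense.lookup("part-0500");
        sparse.lookup("part-0500");
        System.out.println("terms scanned: " + dense.lastScanLength() + " vs " + sparse.lastScanLength());
    }
}
```

With 1,000 terms and an interval of 128, the in-memory index shrinks from 1,000 entries to 8, while a single lookup may have to scan up to 128 dictionary entries instead of 1. The same trade-off is what makes the workaround attractive for an index with a very large number of unique terms.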