> LUCENE-1458 (flexible indexing) has these improvements, Mike, can you explain how it's different? I looked through the code once but yeah, it's in with a lot of other changes.
On Wed, Jun 10, 2009 at 5:40 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > This (very large number of unique terms) is a problem for Lucene currently. > > There are some simple improvements we could make to the terms dict > format to not require so much RAM per term in the terms index... > LUCENE-1458 (flexible indexing) has these improvements, but > unfortunately tied in w/ lots of other changes. Maybe we should break > out a separate issue for this... this'd be a great contained > improvement, if anyone out there has "the itch" :) > > One simple workaround is to call IndexReader.setTermIndexInterval > immediately after opening the reader; this simply loads fewer terms in > the index, using far less RAM, but at the expense of somewhat slower > searching. > > Also: you should peek at your index, eg using Luke, to understand why > you have so many terms. It could be legitimate (indexing a massive > catalog with eg part numbers), or, it could be your document filtering > / analyzer are accidentally producing garbage terms. > > Mike > > On Wed, Jun 10, 2009 at 8:23 AM, Benedikt Boss<na...@web.de> wrote: > > Hej hej, > > > > i have a question regarding lucenes memory usage > > when launching a query. When i execute my query > > lucene eats up over 1gig of heap-memory even > > when my result-set is only a single hit. I > > found out that this is due to the "ensureIndexIsRead()" > > method-call in the "TermInfosReader" class, which > > iterates over all Terms found in the index and saves > > them (including all value-strings) in a Term-Array. > > Is it possible to not read all that stuff > > into memory at all? > > > > Im doing the query like in the following pseudo-code: > > ------------------------------------------------------------------------ > > > > TopScoreDocCollector collector = new TopScoreDocCollector(100000); > > > > QueryParser parser= new QueryParser(field, new WhitespaceAnalyzer() ); > > Directory fsDir = new FSDirectory(indexDir, null); > > IndexSearcher is = new IndexSearcher(fsdir); > > > > Query query = parser.parse(q); > > > > is.search(query, collector); > > ScoreDoc[] hits = collector.topDocs(); > > > > ....... < iterate over hits and print results > > > > > > > Thanks in advance > > Benedikt > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >