> It's an Apache license - but you mentioned something about no third-party libraries. Is that a policy for Lucene?
Pretty much... though one can always submit a patch anyway.

On Mon, Dec 7, 2009 at 4:57 PM, Tom Hill <solr-l...@worldware.com> wrote:

> Hey, that's a nice little class! I hadn't seen it before. But it sounds like the asynchronous cleanup might deal with the problem I mentioned above (but I haven't looked at the code yet).
>
> It's an Apache license - but you mentioned something about no third-party libraries. Is that a policy for Lucene?
>
> Thanks,
>
> Tom
>
> On Mon, Dec 7, 2009 at 4:44 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
>
>> I wonder if the Google Collections concurrent map (even though we don't use third-party libraries), which supports weak keys, handles the removal of weakly referenced keys more elegantly than Java's WeakHashMap?
>>
>> On Mon, Dec 7, 2009 at 4:38 PM, Tom Hill <solr-l...@worldware.com> wrote:
>>
>> > Hi -
>> >
>> > If I understand correctly, WeakHashMap does not free the memory for the value (the cached data) when the key is nulled, or even when the key is garbage collected.
>> >
>> > It requires one more step: a method on WeakHashMap must be called to allow it to release its hard reference to the cached data. It appears that most methods in WeakHashMap end up calling expungeStaleEntries, which clears the hard reference. But you have to call some method on the map before the memory is eligible for garbage collection.
>> >
>> > So it takes four stages to free the cached data: null the key; a GC to clear the weak reference to the key; a call to some method on the map; then the next GC cycle can free the value.
>> >
>> > So it seems possible that you could end up with double memory usage for a time. If no GC runs between the time you close the old reader and the time you start to load the field cache entry for the next reader, the key may still be hanging around uncollected.
>> >
>> > At that point a GC may run when you allocate the new cache, but that's only the first GC. The cached data can't be freed until after the next call to expungeStaleEntries, so for a while you have both caches around.
>> >
>> > This extra usage could cause things to move into tenured space. Could this be causing your problem?
>> >
>> > A workaround would be to cause some method to be called on the WeakHashMap. You don't want to call get(), since that will try to populate the cache. Maybe try putting a small value into the cache, doing a GC, and seeing if your memory drops then.
>> >
>> > Tom
>> >
>> > On Mon, Dec 7, 2009 at 1:48 PM, TCK <moonwatcher32...@gmail.com> wrote:
>> >
>> >> Thanks for the response. But I'm definitely calling close() on the old reader and opening a new one (not using reopen). Also, to simplify the analysis, I ran my test with a single-threaded requester to eliminate any concurrency issues.
>> >>
>> >> I'm doing:
>> >>
>> >> sSearcher.getIndexReader().close();
>> >> sSearcher.close(); // this actually seems to be a no-op
>> >> IndexReader newIndexReader = IndexReader.open(newDirectory);
>> >> sSearcher = new IndexSearcher(newIndexReader);
>> >>
>> >> Btw, isn't it bad practice anyway to have an unbounded cache? Are there any plans to replace the HashMaps used for the innerCaches with an actual size-bounded cache with some eviction policy (perhaps EhCache or something)?
>> >>
>> >> Thanks again,
>> >> TCK
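A minimal, self-contained sketch of the four stages Tom describes above (the class name and value size are made up for illustration, and System.gc() is only a hint to the JVM, so the actual reclamation timing can vary):

    import java.util.Map;
    import java.util.WeakHashMap;

    public class ExpungeDemo {
        public static void main(String[] args) throws Exception {
            Map<Object, byte[]> cache = new WeakHashMap<Object, byte[]>();
            Object key = new Object();
            cache.put(key, new byte[16 * 1024 * 1024]); // a large cached value

            key = null;        // stage 1: drop the last strong reference to the key
            System.gc();       // stage 2: the weak reference to the key gets cleared
            Thread.sleep(100); // give the collector a moment

            // At this point the entry is stale, but the map still holds a hard
            // reference to the 16 MB value: nothing has expunged stale entries yet.

            cache.size();      // stage 3: most map methods call expungeStaleEntries

            System.gc();       // stage 4: the value itself is now collectable
        }
    }

This is also why the workaround Tom suggests should help: a put() (or almost any other call) plus a GC gives the map a chance to drop its hard references to values whose keys have already been collected.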
>> >> On Mon, Dec 7, 2009 at 4:37 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >>
>> >> > What this sounds like is that you're not really closing your readers even though you think you are. Sorting indeed uses up significant memory when it populates internal caches and keeps them around for later use (which is one of the reasons that warming queries matter). But if you really do close the reader, I'm pretty sure the memory should be GC-able.
>> >> >
>> >> > One thing that trips people up is IndexReader.reopen(). If it returns a reader different from the original, you *must* close the old one. If you don't, the old reader is still hanging around and the memory won't be returned. An example adapted from the Javadocs:
>> >> >
>> >> > IndexReader reader = ...
>> >> > ...
>> >> > IndexReader newReader = reader.reopen();
>> >> > if (newReader != reader) {
>> >> >     ... // reader was reopened
>> >> >     reader.close();
>> >> > }
>> >> > reader = newReader;
>> >> > ...
>> >> >
>> >> > If this is irrelevant, could you post your close/open code?
>> >> >
>> >> > HTH
>> >> > Erick
>> >> >
>> >> > On Mon, Dec 7, 2009 at 4:27 PM, TCK <moonwatcher32...@gmail.com> wrote:
>> >> >
>> >> > > Hi,
>> >> > > I'm having heap memory issues when I do Lucene queries that involve sorting by a string field. Such queries seem to load a lot of data onto the heap. Moreover, Lucene seems to hold on to references to this data even after the index reader has been closed and a full GC has been run. One consequence is that, with my generational heap configuration, a lot of memory gets promoted to tenured space each time I close the old index reader and open and query with a new one; the tenured space eventually gets fragmented, causing promotion failures that leave the JVM hanging in stop-the-world GCs.
>> >> > >
>> >> > > Does anyone know any workarounds to avoid these memory issues when doing such queries?
>> >> > >
>> >> > > My profiling showed that even after a full GC, Lucene is holding on to a lot of references to field value data, notably via FieldCacheImpl/ExtendedFieldCacheImpl. I noticed that the WeakHashMap readerCaches use unbounded HashMaps as their innerCaches, and I used reflection to replace these innerCaches with dummy empty HashMaps, but I'm still seeing the same behavior. I wonder if anyone has gone through these same issues before and could offer any advice.
>> >> > >
>> >> > > Thanks a lot,
>> >> > > TCK
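To make Jason's suggestion further up the thread concrete: a weak-keyed concurrent map from Google Collections would be built roughly as below. This is only a sketch, assuming the MapMaker API from Google Collections 1.0; the value type is made up for illustration. Note that weakKeys() compares keys by identity (==), which is what a per-reader cache wants, though such a map still relies on later map activity to physically remove dead entries rather than freeing them at GC time.

    import java.util.concurrent.ConcurrentMap;
    import com.google.common.collect.MapMaker;
    import org.apache.lucene.index.IndexReader;

    public class ReaderCacheSketch {
        // Keys are weakly referenced: once a reader is closed and no longer
        // strongly reachable, its entry becomes eligible for removal without
        // any explicit eviction call. The int[] value type is illustrative.
        private final ConcurrentMap<IndexReader, int[]> cache =
            new MapMaker().weakKeys().makeMap();
    }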