I've opened LUCENE-2135. Mike
On Tue, Dec 8, 2009 at 5:36 AM, Michael McCandless <luc...@mikemccandless.com> wrote: > This is a rather disturbing implementation detail of WeakHashMap, that > it needs the one extra step (invoking one of its methods) for its weak > keys to be reclaimable. > > Maybe on IndexReader.close(), Lucene should go and evict all entries > in the FieldCache associated with that reader. Ie, step through the > sub-readers, and if they are truly closed as well (not shared w/ other > readers), evict. I'll open an issue. > > Even in TCK's code fragment, it's not until the final line is done > executing, that the cache key even loses all hard references, because > it's that line that assigns to sSearcher, replacing the strong > reference to the old searcher. Inserting sSearcher = null prior to > that would drop the hard reference sooner, but because of this impl > detail of WeakHashMap, something would still have to touch it (eg, a > warmup query that hits the field cache) before it's reclaimable. > > Mike > > On Mon, Dec 7, 2009 at 7:38 PM, Tom Hill <solr-l...@worldware.com> wrote: >> Hi - >> >> If I understand correctly, WeakHashMap does not free the memory for the >> value (cached data) when the key is nulled, or even when the key is garbage >> collected. >> >> It requires one more step: a method on WeakHashMap must be called to allow >> it to release its hard reference to the cached data. It appears that most >> methods in WeakHashMap end up calling expungeStaleEntries, which will clear >> the hard reference. But you have to call some method on the map, before the >> memory is eligible for garbage collection. >> >> So it requires four stages to free the cached data. Null the key; A GC to >> release the weak reference to the key; A call to some method on the map; >> Then the next GC cycle should free the value. >> >> So it seems possible that you could end up with double memory usage for a >> time. If you don't have a GC between the time that you close the old reader, >> and you start to load the field cache entry for the next reader, then the >> key may still be hanging around uncollected. >> >> At that point, it may run a GC when you allocate the new cache, but that's >> only the first GC. It can't free the cached data until after the next call >> to expungeStaleEntries, so for a while you have both caches around. >> >> This extra usage could cause things to move into tenured space. Could this >> be causing your problem? >> >> Workaround would be to cause some method to be called on the WeakHashMap. >> You don't want to call get(), since that will try to populate the cache. >> Maybe if you tried putting a small value to the cache, and doing a GC, and >> see if your memory drops then. >> >> >> Tom >> >> >> >> On Mon, Dec 7, 2009 at 1:48 PM, TCK <moonwatcher32...@gmail.com> wrote: >> >>> Thanks for the response. But I'm definitely calling close() on the old >>> reader and opening a new one (not using reopen). Also, to simplify the >>> analysis, I did my test with a single-threaded requester to eliminate any >>> concurrency issues. >>> >>> I'm doing: >>> sSearcher.getIndexReader().close(); >>> sSearcher.close(); // this actually seems to be a no-op >>> IndexReader newIndexReader = IndexReader.open(newDirectory); >>> sSearcher = new IndexSearcher(newIndexReader); >>> >>> Btw, isn't it bad practice anyway to have an unbounded cache? Are there any >>> plans to replace the HashMaps used for the innerCaches with an actual >>> size-bounded cache with some eviction policy (perhaps EhCache or something) >>> ? >>> >>> Thanks again, >>> TCK >>> >>> >>> >>> >>> On Mon, Dec 7, 2009 at 4:37 PM, Erick Erickson <erickerick...@gmail.com >>> >wrote: >>> >>> > What this sounds like is that you're not really closing your >>> > readers even though you think you are. Sorting indeed uses up >>> > significant memory when it populates internal caches and keeps >>> > it around for later use (which is one of the reasons that warming >>> > queries matter). But if you really do close the reader, I'm pretty >>> > sure the memory should be GC-able. >>> > >>> > One thing that trips people up is IndexReader.reopen(). If it >>> > returns a reader different than the original, you *must* close the >>> > old one. If you don't, the old reader is still hanging around and >>> > memory won't be returne.... An example from the Javadocs... >>> > >>> > IndexReader reader = ... >>> > ... >>> > IndexReader new = r.reopen(); >>> > if (new != reader) { >>> > ... // reader was reopened >>> > reader.close(); >>> > } >>> > reader = new; >>> > ... >>> > >>> > >>> > If this is irrelevant, could you post your close/open >>> > >>> > code? >>> > >>> > HTH >>> > >>> > Erick >>> > >>> > >>> > On Mon, Dec 7, 2009 at 4:27 PM, TCK <moonwatcher32...@gmail.com> wrote: >>> > >>> > > Hi, >>> > > I'm having heap memory issues when I do lucene queries involving >>> sorting >>> > by >>> > > a string field. Such queries seem to load a lot of data in to the heap. >>> > > Moreover lucene seems to hold on to references to this data even after >>> > the >>> > > index reader has been closed and a full GC has been run. Some of the >>> > > consequences of this are that in my generational heap configuration a >>> lot >>> > > of >>> > > memory gets promoted to tenured space each time I close the old index >>> > > reader >>> > > and after opening and querying using a new one, and the tenured space >>> > > eventually gets fragmented causing a lot of promotion failures >>> resulting >>> > in >>> > > jvm hangs while the jvm does stop-the-world GCs. >>> > > >>> > > Does anyone know any workarounds to avoid these memory issues when >>> doing >>> > > such lucene queries? >>> > > >>> > > My profiling showed that even after a full GC lucene is holding on to a >>> > lot >>> > > of references to field value data notably via the >>> > > FieldCacheImpl/ExtendedFieldCacheImpl. I noticed that the WeakHashMap >>> > > readerCaches are using unbounded HashMaps as the innerCaches and I used >>> > > reflection to replace these innerCaches with dummy empty HashMaps, but >>> > > still >>> > > I'm seeing the same behavior. I wondered if anyone has gone through >>> these >>> > > same issues before and would offer any advice. >>> > > >>> > > Thanks a lot, >>> > > TCK >>> > > >>> > >>> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org