[ https://issues.apache.org/jira/browse/SOLR-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351751#comment-17351751 ]
David Smiley commented on SOLR-11882:
-------------------------------------

[~ab] I'm looking very closely at the SolrDispatchFilter.close/destroy process to make sure I understand every detail. This {{metricManager.unregisterGauges}} call was added in this issue (SOLR-11882). I'm guessing this particular call was not what fixed the memory leak, and that other parts of this issue were, since the leak concerns individual SolrCore instances?

Could/should it go to CoreContainer.close? Why does SolrDispatchFilter hold a reference to the MetricManager; can't it simply look it up via CoreContainer.getMetricManager? Less state is better IMO.

> SolrMetric registries retain references to SolrCores when closed
> ----------------------------------------------------------------
>
>                 Key: SOLR-11882
>                 URL: https://issues.apache.org/jira/browse/SOLR-11882
>             Project: Solr
>          Issue Type: Bug
>          Components: metrics, Server
>   Affects Versions: 7.1
>            Reporter: Eros Taborelli
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 7.4, 8.0
>
>         Attachments: SOLR-11882-7x.patch, SOLR-11882.patch, SOLR-11882.patch,
> SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch,
> create-cores.zip, solr-dump-full_Leak_Suspects.zip, solr.config.zip
>
>
> *Description:*
> Our setup involves using a lot of small cores (possibly a hundred thousand),
> but working only on a few of them at any given time.
> We already followed all recommendations in this guide:
> [https://wiki.apache.org/solr/LotsOfCores]
> We noticed that after creating/loading around 1000-2000 empty cores, with no
> documents inside, the heap consumption went through the roof despite having
> set transientCacheSize to only 64 (heap size set to 12G).
> All cores are correctly set to loadOnStartup=false and transient=true, and we
> have verified via logs that the cores in excess are actually being closed.
> However, a reference remains in the
> org.apache.solr.metrics.SolrMetricManager#registries that is never removed
> until a core is fully unloaded.
> Restarting the JVM loads all cores in the admin UI, but doesn't populate the
> ConcurrentHashMap until a core is actually fully loaded.
> I reproduced the issue on a smaller scale (transientCacheSize = 5, heap size
> = 512m) and made a report (attached) using Eclipse MAT.
> *Desired outcome:*
> When a transient core is closed, the references in the SolrMetricManager
> should be removed, in the same way that the reporters for the core are
> closed and removed.
> Alternatively, an unloadOnClose=true|false flag could be implemented to fully
> unload a transient core when it is closed due to the cache size.
> *Note:*
> The documentation says everywhere that unused cores will be unloaded,
> but this is misleading, as the cores are never fully unloaded.
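
For readers following the thread, the retention pattern described in the report can be sketched roughly as below. This is only an illustration under stated assumptions: every class and method name here (FakeCore, FakeRegistry, closeLeaky, closeAndRelease) is a hypothetical stand-in, not Solr's actual code, and the real SolrMetricManager is considerably more involved. The point is simply that a long-lived map entry whose value references a core keeps that core reachable after it is "closed", and that removing the entry on close releases it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins only; none of these types are Solr classes.
final class RegistryLeakSketch {

    /** Stand-in for a core that would hold significant heap while reachable. */
    static final class FakeCore {
        final String name;
        FakeCore(String name) { this.name = name; }
    }

    /** Stand-in for a per-core metric registry whose gauges capture the core. */
    static final class FakeRegistry {
        final FakeCore owner; // strong reference: keeps the core reachable
        FakeRegistry(FakeCore owner) { this.owner = owner; }
    }

    /** Stand-in for the manager-level registries map mentioned in the report. */
    static final Map<String, FakeRegistry> registries = new ConcurrentHashMap<>();

    /** Opens a core and registers its (hypothetical) metric registry. */
    static FakeCore open(String name) {
        FakeCore core = new FakeCore(name);
        registries.put(name, new FakeRegistry(core));
        return core;
    }

    /** Leaky close: the core is treated as closed, but its map entry stays. */
    static void closeLeaky(FakeCore core) {
        // Intentionally removes nothing, mirroring the reported behaviour.
    }

    /** Desired close: drop the registry entry so the core can be collected. */
    static void closeAndRelease(FakeCore core) {
        registries.remove(core.name);
    }

    public static void main(String[] args) {
        FakeCore a = open("core-a");
        FakeCore b = open("core-b");

        closeLeaky(a);
        closeLeaky(b);
        System.out.println("entries after leaky close: " + registries.size());

        closeAndRelease(a);
        closeAndRelease(b);
        System.out.println("entries after releasing close: " + registries.size());
    }
}
```

In a heap dump, the leaky variant would show each closed core still reachable through the map's values, which is consistent with what the attached MAT report is said to show.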