[ 
https://issues.apache.org/jira/browse/SOLR-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351751#comment-17351751
 ] 

David Smiley commented on SOLR-11882:
-------------------------------------

[~ab] I'm looking very closely at the SolrDispatchFilter.close/destroy process 
to ensure I understand every detail thoroughly.  This 
{{metricManager.unregisterGauges}} call was added in this JIRA issue 
SOLR-11882.  I'm guessing this particular call is not what fixed the memory 
leak, and that other parts of this issue did, since the leak is about 
individual SolrCores being retained?  Could/should the call move to 
CoreContainer.close?  Why 
does SolrDispatchFilter have a reference to the MetricManager; can't it simply 
look it up via CoreContainer.getMetricManager?  Less state is better IMO.
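
To make the "less state" suggestion concrete, here is a minimal sketch using 
hypothetical stand-in classes (SketchFilter, SketchContainer and 
SketchMetricManager are invented for illustration and are not Solr's actual 
code or signatures).  The filter keeps only the container reference and looks 
the metric manager up when it needs to unregister its gauges:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-ins; these are NOT Solr's actual classes.
class SketchMetricManager {
    private final Map<String, Supplier<Object>> gauges = new ConcurrentHashMap<>();

    void registerGauge(String tag, Supplier<Object> gauge) {
        gauges.put(tag, gauge);
    }

    // Removes the gauge registered under the tag; returns how many were removed.
    int unregisterGauges(String tag) {
        return gauges.remove(tag) != null ? 1 : 0;
    }

    int gaugeCount() {
        return gauges.size();
    }
}

class SketchContainer {
    private final SketchMetricManager metricManager = new SketchMetricManager();

    SketchMetricManager getMetricManager() {
        return metricManager;
    }
}

class SketchFilter {
    // The container is the only long-lived reference the filter holds.
    private final SketchContainer cores;
    private final String metricTag = "filter-" + Integer.toHexString(hashCode());

    SketchFilter(SketchContainer cores) {
        this.cores = cores;
        cores.getMetricManager().registerGauge(metricTag, () -> "ok");
    }

    void close() {
        // Look the manager up on demand instead of caching it in a second field.
        cores.getMetricManager().unregisterGauges(metricTag);
    }
}
```

Whether the real unregister call belongs in the filter or in the container's 
close is exactly the open question above; the sketch only shows that the 
extra field is not needed for either choice.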

> SolrMetric registries retain references to SolrCores when closed
> ----------------------------------------------------------------
>
>                 Key: SOLR-11882
>                 URL: https://issues.apache.org/jira/browse/SOLR-11882
>             Project: Solr
>          Issue Type: Bug
>          Components: metrics, Server
>    Affects Versions: 7.1
>            Reporter: Eros Taborelli
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 7.4, 8.0
>
>         Attachments: SOLR-11882-7x.patch, SOLR-11882.patch, SOLR-11882.patch, 
> SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, 
> create-cores.zip, solr-dump-full_Leak_Suspects.zip, solr.config.zip
>
>
> *Description:*
> Our setup involves using a lot of small cores (possibly a hundred thousand), 
> but working only on a few of them at any given time.
> We already followed all recommendations in this guide: 
> [https://wiki.apache.org/solr/LotsOfCores]
> We noticed that after creating/loading around 1000-2000 empty cores, with no 
> documents inside, the heap consumption went through the roof despite having 
> set transientCacheSize to only 64 (heap size set to 12G).
> All cores are correctly set to loadOnStartup=false and transient=true, and we 
> have verified via logs that the cores in excess are actually being closed.
> However, a reference remains in the 
> org.apache.solr.metrics.SolrMetricManager#registries that is never removed 
> until a core is fully unloaded.
> Restarting the JVM loads all cores in the admin UI, but doesn't populate the 
> ConcurrentHashMap until a core is actually fully loaded.
> I reproduced the issue on a smaller scale (transientCacheSize = 5, heap size 
> = 512m) and made a report (attached) using Eclipse MAT.
> *Desired outcome:*
> When a transient core is closed, the references in the SolrMetricManager 
> should be removed, in the same fashion the reporters for the core are also 
> closed and removed.
> Alternatively, an unloadOnClose=true|false flag could be implemented to fully 
> unload a transient core when closed due to the cache size.
> *Note:*
> The documentation mentions everywhere that the unused cores will be unloaded, 
> but it's misleading as the cores are never fully unloaded.
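
The retention pattern described in the issue can be modelled with a small 
sketch.  ModelCore and ModelRegistries are hypothetical names, and the 
"solr.core." key prefix is only illustrative; this is not Solr's real 
implementation.  A gauge registered by a core captures that core, so the 
registry map keeps the closed core reachable until the entry is removed:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical model of the reported behaviour; not Solr's real classes.
class ModelRegistries {
    // registry name -> gauge; each gauge lambda captures the core it reports on
    final Map<String, Supplier<Integer>> registries = new ConcurrentHashMap<>();
}

class ModelCore {
    final String name;
    private final ModelRegistries metrics;
    // Stands in for the caches, searchers, etc. that make a core expensive.
    private final byte[] heavyState = new byte[1024];

    ModelCore(String name, ModelRegistries metrics) {
        this.name = name;
        this.metrics = metrics;
        // The lambda references 'this', so the registry keeps the core reachable.
        metrics.registries.put("solr.core." + name, () -> heavyState.length);
    }

    // Close as described in the report: the registry entry survives.
    void closeLeaky() {
        // Releases nothing in the registry, so the core cannot be collected.
    }

    // Close as the reporter requests: drop the registry entry as well.
    void closeClean() {
        metrics.registries.remove("solr.core." + name);
    }
}
```

With many transient cores, every entry left behind by the leaky close pins a 
whole core on the heap, which matches the growth the reporter observed.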



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
