[
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156203#comment-15156203
]
Gus Heck commented on SOLR-8349:
--------------------------------
*WRT #3/deterministic behavior*: Here's the use case:
# server is started, it loads a component that loads a file and creates
resource A version 1 into memory
# some time later the file is updated, and these updates need to be deployed
# the new version 2 of the file is deployed to the server and the core is
unloaded
# the core is then loaded again, brought online, and made available to users.
We now cannot predict which version of the resource is available to the users.
If GC occurred and the resource was collected between steps 3 and 4, the new
resource will become available as the user would expect. If not, the old
resource will show up on calls to getResource() until a GC occurs in which the
JVM decides to clear the weak reference to it. If the component caches a (hard)
reference to the resource, the new version of the resource will never get
loaded. The previous system without weak references did not allow the old
resource to ever be unloaded (and hence was deterministic). Now the behavior is
a product of GC timing and the internal aspects of how the component was
programmed. I would like to subsequently (in some later patch) make it possible
to refresh the resource in a predictable manner without restarting the whole
node.
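To make the non-determinism concrete, here is a minimal Java sketch of the scenario above. WeakResourceRegistry and its getResource(key, loader) signature are hypothetical stand-ins, not the classes from the patch; they only model sharing a resource through a weak reference. The sketch shows the "component caches a hard reference" case, where the redeployed version is never loaded.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry (an assumption for illustration, not the patch's class).
class WeakResourceRegistry {
    private final Map<String, WeakReference<Object>> resources = new HashMap<>();

    // Returns the live resource for the key, or runs the loader when the weak
    // reference was never set or has already been cleared by the GC.
    synchronized Object getResource(String key, Supplier<Object> loader) {
        WeakReference<Object> ref = resources.get(key);
        Object existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing; // possibly a stale version
        }
        Object loaded = loader.get();
        resources.put(key, new WeakReference<>(loaded));
        return loaded;
    }
}

class StaleResourceDemo {
    public static void main(String[] args) {
        WeakResourceRegistry registry = new WeakResourceRegistry();

        // Step 1: the component loads version 1 and keeps a hard reference.
        Object v1 = registry.getResource("A", () -> new StringBuilder("version 1"));

        // Steps 2-4: the file is updated and the core reloaded, so the loader
        // would now produce version 2. It is never invoked, because v1 is
        // still strongly reachable and the weak reference cannot be cleared.
        Object seen = registry.getResource("A", () -> new StringBuilder("version 2"));

        System.out.println(seen + ", same object as v1: " + (seen == v1));
    }
}
```

If nothing holds v1 between the two calls, the second call returns either version depending on whether a GC ran in between, which is the unpredictability described above.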
*WRT hard references*: I want people to have success not missteps and
re-implementation using my feature :). For this reason I really like the weak
references suggestion you made, but I want to manage it for them and not burden
them with handling it properly. The submitted approach was meant to not bite
the user who writes a component that never holds a reference to the resource.
This would be a reasonable naive implementation for someone who knows nothing
about the internals of Solr and assumes they shouldn't hold the reference, so
that the same resource is always seen everywhere.
*WRT the abstraction*: it's there to get the loading code added to the
deferredCallables list. SolrResourceLoader has no knowledge of the SolrCore
until the core calls inform(core) on it. Unfortunately inform(resourceLoader)
gets called before that. So any attempt to cast and do
((SolrResourceLoader)loader).getCore().getContainer() in the implementation of
ResourceLoader#inform(loader) will throw an NPE. That's why the
deferredCallables list exists. I chose to add the abstraction to enable the
loader/core to manage hard references and allow the processing to become
uniform with all loads being deferred. I wanted the folks attempting to use
this to have a clear intuitive path to do so and the interfaces are meant to
guide them into doing the right thing without needing to know all the details.
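As a rough illustration of why the loading has to be deferred, here is a self-contained Java sketch. FakeCore, FakeLoader, defer() and containerName() are hypothetical names made up for this example; they are not the Solr classes or the interfaces in the patch. The sketch only shows the ordering problem: work that needs the core is queued while the loader does not know its core, and runs once the core has been supplied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical stand-in for the core (assumption, not SolrCore).
class FakeCore {
    String containerName() { return "container"; }
}

// Hypothetical stand-in for the loader (assumption, not SolrResourceLoader).
class FakeLoader {
    private FakeCore core; // stays null until inform(core) is called
    private final List<Callable<String>> deferred = new ArrayList<>();

    FakeCore getCore() { return core; }

    // A component calls this from its own inform(loader) callback, at a
    // point where the core is not yet known to the loader.
    void defer(Callable<String> work) { deferred.add(work); }

    // Called once the core exists; runs everything that was deferred.
    List<String> inform(FakeCore core) {
        this.core = core;
        List<String> results = new ArrayList<>();
        for (Callable<String> work : deferred) {
            try {
                results.add(work.call());
            } catch (Exception e) {
                throw new IllegalStateException("deferred load failed", e);
            }
        }
        deferred.clear();
        return results;
    }
}

class DeferredLoadDemo {
    public static void main(String[] args) {
        FakeLoader loader = new FakeLoader();

        // Too early: the loader has no core yet, so navigating through it fails.
        try {
            loader.getCore().containerName();
        } catch (NullPointerException e) {
            System.out.println("core not available yet");
        }

        // Deferring the same navigation makes it run after the core is set.
        loader.defer(() -> "loaded via " + loader.getCore().containerName());
        System.out.println(loader.inform(new FakeCore()));
    }
}
```

The design point is that the component author only hands over a callable and never has to reason about when the core becomes reachable.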
It's worth noting that if the goal is a simple patch, the way to eliminate the
MOST complexity from the patch is to have the component author manage
references, and change:
{code}
resourceLoader.inform(resourceLoader);
resourceLoader.inform(this); // last call before the latch is released.
{code}
to
{code}
resourceLoader.inform(this);
resourceLoader.inform(resourceLoader); // last call before the latch is released.
{code}
In that case, casting and navigating to the container in inform(ResourceLoader)
will work and we can lose the abstractions, the deferred callables and
associated latch/synchronization, and the object reference code goes away
too... but I definitely don't feel qualified to change the order in which
components are made aware of things. I have no idea if any code out there would
be relying on this order of inform() calls in some way.
Lastly, Object keys are certainly possible, though this does reintroduce a
vector for class loader memory leakages as previously discussed. I left this
out because we were not supporting the lucene analyzers yet, and I wasn't yet
adding "automatic" keys from configuration nodes. Automatic keys would be a
nice addition that improves the feature and ensures implementors don't need to
think so hard to use it. I'm amenable to adding that now if you like, though the
option to supply one's own key should remain.
> Allow sharing of large in memory data structures across cores
> -------------------------------------------------------------
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
> Issue Type: Improvement
> Components: Server
> Affects Versions: 5.3
> Reporter: Gus Heck
> Attachments: SOLR-8349.patch, SOLR-8349.patch
>
>
> In some cases search components or analysis classes may utilize a large
> dictionary or other in-memory structure. When multiple cores are loaded with
> identical configurations utilizing this large in memory structure, each core
> holds its own copy in memory. This has been noted in the past and a specific
> case reported in SOLR-3443. This patch provides a generalized capability, and
> if accepted, this capability will then be used to fix SOLR-3443.