[
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147835#comment-15147835
]
Gus Heck commented on SOLR-8349:
--------------------------------
Guava sounded like a great idea. I love Guava caches and have used them
frequently in the past. When you mentioned it I thought to myself, why didn't I
think of that... eventually I remembered that I had, but decided not to go
there because I didn't want to give control of memory issues to a library. On
reflection that may have been a bit too draconian, so I gave Guava caches a
whirl. For the first time in my experience, I think I've found a use case they
don't cover well. My design for this feature calls for the following behavior
(sketched in code after the list):
# get should return null if the resource is requested before any attempt has
been made to load it.
# a resource is loaded only once; no provision for update or replacement is
presently required, so first one in wins is just fine.
# subsequent attempts to load the resource are a non-blocking no-op, allowing
cores 2 through N that require the resource to continue configuring themselves
while the resource is being loaded (possibly starting the loading of resources
for other components).
# loading will be complete before the server finishes startup and begins
servicing requests. If a resource was supposed to be loaded and there was no
error during loading, the component can rely on the existence of the resource
at query (or update) time.
# a query or update to Solr never initiates loading, for obvious latency
reasons.
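To make that contract concrete, here's a rough sketch of the behavior I'm after. The class and method names are hypothetical, not the patch's actual code:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the contract above, not the code in the patch.
class SharedResourceCache {
  private final ConcurrentHashMap<String, Object> loaded = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<String, Thread> loaders = new ConcurrentHashMap<>();

  // goals #1 and #5: a plain get never triggers loading, and returns
  // null until a load has completed
  Object get(String key) {
    return loaded.get(key);
  }

  // goals #2 and #3: the first registration wins and kicks off the load
  // in the background; later registrations return immediately
  void load(String key, Supplier<Object> loader) {
    loaders.computeIfAbsent(key, k -> {
      Thread t = new Thread(() -> loaded.put(k, loader.get()), "loader-" + k);
      t.start();
      return t;
    });
  }

  // goal #4: the container calls this once, before startup completes
  void awaitLoading() throws InterruptedException {
    for (Thread t : loaders.values()) {
      t.join();
    }
  }
}
{code}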
I realized that the unit test I supplied doesn't fully test #3 above, so I
modified it like this:
{code}
+    final String[] lastThread = new String[1];
     // now add something to the map while it's under load
     try {
       Thread[] set1 = new Thread[100];
{code}
Then
{code}
           calls.incrementAndGet();
-        });
+          lastThread[0] = Thread.currentThread().getName();
+        }, "thread"+i); // give the threads names like thread0 thread1 thread2 etc
       }
       for (Thread thread : set1) {
         thread.start();
+        Thread.sleep(10);
       }
       while (calls.get() < 100) {
         Thread.sleep(100);
       }
{code}
Then
{code}
       Thread.sleep(100);
       long count1b = counter[0];
       long count2b = counter2[0];
+
+      // make sure other loaders didn't wait and thread0 completed last
+      assertEquals("thread0", lastThread[0]);
{code}
This modified test still passes just fine with my original code. But so far I
haven't made it pass with Guava. The naive first attempt was to use
get(key,loader), ignore the return value, and use getIfPresent(key) to
service get requests (handling case #1), but this was not good:
# get(key,loader) while ignoring the return value will block all cores that
depend on this resource (see the sketch after this list). The test fails with
loader99 updating the array last.
# wrapping the get(key,loader) in a thread fails because now they all complete
before loading completes, and the test gets null when it checks the assert that
the value was set. (It would eventually become set, but this is just like
completing server startup before loading is complete and then failing queries,
probably with an NPE.) Also, failing server startup when loading fails becomes
problematic, since we have to get the exception back out of the thread.
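For reference, the naive attempt from item #1 looks roughly like this; it uses Guava's standard Cache API, but loadBigResource and the key names are illustrative:
{code}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.ExecutionException;

class NaiveGuavaAttempt {
  private static final Cache<String, Object> CACHE = CacheBuilder.newBuilder().build();

  // what each core does at configure time: Guava's get(key, loader) makes
  // every caller for that key wait until the first loader finishes,
  // which is exactly what goal #3 forbids
  static void loadOnce(String key) throws ExecutionException {
    CACHE.get(key, () -> loadBigResource(key)); // return value ignored
  }

  // what a component does at query time (goal #1: null if never loaded)
  static Object fetch(String key) {
    return CACHE.getIfPresent(key);
  }

  private static Object loadBigResource(String key) {
    return new Object(); // stands in for an expensive dictionary load
  }
}
{code}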
After those attempts I found myself tracking which keys have already had
loaders supplied and trying to coordinate blocking/pass-through on my own, and
of course access to the collection recording what I've seen has to be
synchronized... etc. This winds up negating most of the benefit of the Guava
cache. That's too bad, because at first it looked like Guava would replace a
bunch of my code with one-liners. My guess is that the Guava folks didn't
supply put(k,loader) because most of the time you aren't accepting arbitrary
loaders in the first place. If the loader is known in advance, LoadingCache
works nicely. Unfortunately, for us, the loader is not known in advance. I
don't see any reasonable way to write "one loader to rule them all" to let us
use a LoadingCache either; that would just move all my tracking code into
TheOneLoader.java :). Possibly a good reorganization code-wise, but we are
still writing threaded code and not getting much simplification out of using
Guava caches that way. If we are going to write synchronized code, I think we
should just make it do the right thing directly.
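To illustrate, the "one loader to rule them all" would look something like this hypothetical sketch; note that the registry of per-key loaders, and its coordination, is still ours to write:
{code}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: the bookkeeping Guava was supposed to remove is
// still here, just relocated into the dispatching loader.
class TheOneLoader extends CacheLoader<String, Object> {
  static final ConcurrentMap<String, Callable<Object>> REGISTERED = new ConcurrentHashMap<>();
  static final LoadingCache<String, Object> CACHE =
      CacheBuilder.newBuilder().build(new TheOneLoader());

  @Override
  public Object load(String key) throws Exception {
    Callable<Object> loader = REGISTERED.get(key); // must have been registered first
    if (loader == null) {
      throw new IllegalStateException("no loader registered for " + key);
    }
    return loader.call();
  }
}
{code}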
Finally, weakValues() (or some equivalent) sounds good, but the problem is that
the component supplies a loader, the container loads it at time A, and then at
time B the component asks for the result of the load. Even if the component
caches the result of that first request, the time between A and B leaves the
just-loaded object vulnerable to GC, since the only thing referencing it is the
weak-values cache. One could require the loader to set a reference, but that
presumes that particular core won't be unloaded before some other core finally
receives a request that causes it to fetch and cache the resource (remember,
only one loader ever runs, so only one component would get the hard reference
initially). It seems that softValues() could be used, but without a hard
reference somewhere we are still potentially in trouble (the sketch after this
list makes the window concrete):
# All cores with hard references are unloaded; core X remains, having yet to
acquire a hard reference.
# When memory gets low, the soft resource is GC'd.
# A request arrives at core X requiring the resource.
# Either goal #4 or #5 must now be violated.
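Here's the vulnerable window in sketch form, using Guava's weakValues(); softValues() behaves the same way once memory pressure triggers collection. The key and resource are illustrative:
{code}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

class WeakValuesWindow {
  public static void main(String[] args) throws Exception {
    Cache<String, Object> cache = CacheBuilder.newBuilder().weakValues().build();

    // time A: the container runs the loader for the first core; afterwards
    // the cache's weak reference is the only thing pointing at the result
    cache.get("dictionary", () -> new Object() /* stands in for the big resource */);

    System.gc(); // any intervening GC may discard a weakly-held value

    // time B: the component finally asks for the result and may find nothing
    System.out.println(cache.getIfPresent("dictionary")); // possibly null
  }
}
{code}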
It's clearly the case that my design goals make things difficult. If I drop
goal #3 (not blocking other cores), Guava will drop right in nicely via
get(key,loader) with the return value ignored. Which do you feel is more
important: using Guava and keeping the code simple, or preserving startup
efficiency for the server when this feature is in use? I found this comment in
CoreContainer, which clearly indicates that the intent is to load cores in
parallel, but it also seems there's some degree of waiting going on already in
cloud mode:
{code}
// setup executor to load cores in parallel
// do not limit the size of the executor in zk mode since cores may try and wait for each other.
ExecutorService coreLoadExecutor = ExecutorUtil.newMDCAwareFixedThreadPool(
{code}
> Allow sharing of large in memory data structures across cores
> -------------------------------------------------------------
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
> Issue Type: Improvement
> Components: Server
> Affects Versions: 5.3
> Reporter: Gus Heck
> Attachments: SOLR-8349.patch
>
>
> In some cases search components or analysis classes may utilize a large
> dictionary or other in-memory structure. When multiple cores are loaded with
> identical configurations utilizing this large in memory structure, each core
> holds its own copy in memory. This has been noted in the past and a specific
> case reported in SOLR-3443. This patch provides a generalized capability, and
> if accepted, this capability will then be used to fix SOLR-3443.