[
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894184#comment-16894184
]
Hoss Man commented on SOLR-13579:
---------------------------------
Honestly, i'm still very lost.
Part of my struggle is i'm trying to wade into the patch, and review the APIs
and functionality it contains, while knowing – as you mentioned – that not
all the details are here, and it's not fully fleshed out w/everything you
intend as far as configuration and customization and having more concrete
implementations beyond just the {{CacheManagerPlugin}}.
I know that in your mind there is more that can/should be done, and that some
of this code is just "placeholder" for later, but i don't have enough
familiarity with the "long term" plan to really understand what in the current
patch is placeholder or stub APIs, vs what is "real" and exists because of long
term visions for how all of these pieces can be used together in a more
generalized system – ie: what classes might have surface APIs that look more
complex than needed given what's currently implemented in the patch, because of
how you envision those classes being used in the future?
Just to pick one example: my question about the "ResourceManagerPool" vs
"ResourceManagerPlugin" – in your reply you said...
{quote}The code in ResourceManagerPool is independent of the type of
resource(s) that a pool can manage. ...
{quote}
...but the code in {{ResourceManagerPlugin}} is _also_ independent of any
specific type of resource(s) that a pool can manage – those specifics only
exist in the concrete subclasses. Hence the crux of my question is why these
two very generalized pieces of abstract functionality/data collection couldn't
just be a single abstract base class for all (concrete) ResourceManagerPlugin
subclasses to extend?
Your followup gives a clue...
{quote}...perhaps at some point we could allow a single pool to manage several
aspects of a component, in which case a pool could have several plugins.
{quote}
but w/o some "concrete hypothetical" examples of what that might look like,
it's hard to evaluate if the current APIs are the "best" approach, or if maybe
there is something better/simpler.
{quote}Also, there can be different pools of the same type, each used for a
different group of components that support the same management aspect. For
example, for searcher caches we may want to eventually create separate pools
for filterCache, queryResultCache and fieldValueCache. All of these pools would
use the same plugin implementation CacheManagerPlugin but configured with
different params and limits.
{quote}
But even in this situation, there could be multiple *instances* of a
{{CacheManagerPlugin}}, one for each pool, each with different params and
limits, w/o needing distinction between the {{ResourceManagerPlugin}}
concept/instances and the {{ResourceManagerPool}} concept/instances.
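To make that question a bit more concrete, here is a rough sketch of what the "single abstract base class" alternative might look like. None of this is from the patch: {{AbstractManagedPool}}, {{CachePoolSketch}} and {{FakeCache}} are hypothetical names, and the {{manage()}} policy is a trivial placeholder. It only shows that two instances of one concrete class can carry different names and limits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not code from the patch: one abstract class plays both
// the "pool" role (holds components and limits) and the "plugin" role (manage()).
abstract class AbstractManagedPool<C> {
  private final String name;
  private final Map<String, Object> poolLimits;
  private final List<C> components = new ArrayList<>();

  AbstractManagedPool(String name, Map<String, Object> poolLimits) {
    this.name = name;
    this.poolLimits = new HashMap<>(poolLimits);
  }

  String getName() { return name; }
  Map<String, Object> getPoolLimits() { return poolLimits; }
  List<C> getComponents() { return components; }
  void register(C component) { components.add(component); }

  // Resource-specific logic lives only in concrete subclasses.
  abstract void manage();
}

// Hypothetical stand-in for a cache with an adjustable RAM limit.
class FakeCache {
  long maxRamMB;
  FakeCache(long maxRamMB) { this.maxRamMB = maxRamMB; }
}

// One concrete class; each instance is a separately configured pool.
class CachePoolSketch extends AbstractManagedPool<FakeCache> {
  CachePoolSketch(String name, long maxRamMB) {
    super(name, Map.<String, Object>of("maxRamMB", maxRamMB));
  }

  @Override
  void manage() {
    // Placeholder policy (an assumption): cap every cache at the pool limit.
    long cap = (Long) getPoolLimits().get("maxRamMB");
    for (FakeCache cache : getComponents()) {
      cache.maxRamMB = Math.min(cache.maxRamMB, cap);
    }
  }
}
```

With this shape, {{new CachePoolSketch("filterCachePool", 100)}} and {{new CachePoolSketch("queryResultCachePool", 50)}} are two independently configured pools of the same type. Whether that is enough depends on the "several plugins per pool" scenario, which this sketch does not cover.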
(To be clear, i'm not trying to harp on the specific design/separation/linkage
of {{ResourceManagerPlugin}} vs {{ResourceManagerPool}} – these are just some
of the first classes i looked at and had questions about. I'm just using them
as examples of where/how it's hard to ask questions or form opinions about the
current API/code w/o having a better grasp of some "concrete specifics" (or even
"hypothetical specifics") of when/how/where/why each of these APIs are expected
to be used and interact w/each other.)
Another example of where i got lost as to the specific motivation behind some
of these APIs in the long term view is in the "loose coupling" that currently
exists in the patch between the {{ManagedComponent}} API and
{{ResourceManagerPlugin}}:
As i understand it:
* An object in Solr supports being managed by a particular subclass of
{{ResourceManagerPlugin}} if and only if it extends {{ManagedComponent}} and
implements {{ManagedComponent.getManagedResourceTypes()}} such that the
resulting {{Collection<String>}} contains a String matching the return value of
a {{ResourceManagerPlugin.getType()}} for that particular
{{ResourceManagerPlugin}}
** ie: {{SolrCache}} extends the {{ManagedComponent}} interface, and all
classes implementing {{SolrCache}} should/must implement
{{getManagedResourceTypes()}} by returning a java {{Collection}} containing
{{CacheManagerPlugin.TYPE}}
* once some {{ManagedComponent}} instances are "registered in a pool" and
managed by a specific {{ResourceManagerPlugin}} instance then that plugin
expects to be able to call {{ManagedComponent.setResourceLimits(Map<String,
Object> limits)}} and {{ManagedComponent.getResourceLimits()}} on all of those
{{ManagedComponent}} instances, and that both Maps should contain/support a set
of {{String}} keys specific to that {{ResourceManagerPlugin}} subclass according
to {{ResourceManagerPlugin.getControlledParams()}}
** ie: {{CacheManagerPlugin.getControlledParams()}} returns a java
{{Collection}} containing {{SolrCache.MAX_RAM_MB_PARAM}} and
{{SolrCache.MAX_SIZE_PARAM}} which are also used in a "switch style" if/else
block in {{LRUCache.setResourceLimit(...)}} to adjust the corresponding
(private, strongly typed) internal variables.
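For readers who haven't opened the patch, the coupling described in the bullets above looks roughly like the following. This is a simplified stand-in, not the patch's code: the {{...Sketch}} names are hypothetical, the String values of the constants are assumptions, and the real method signatures may differ.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the ManagedComponent contract described above;
// the actual declarations in the patch may differ.
interface ManagedComponentSketch {
  Collection<String> getManagedResourceTypes();
  void setResourceLimits(Map<String, Object> limits);
  Map<String, Object> getResourceLimits();
}

// Hypothetical cache; the constant values are assumptions.
class LooseCacheSketch implements ManagedComponentSketch {
  static final String TYPE = "cache";
  static final String MAX_RAM_MB_PARAM = "maxRamMB";
  static final String MAX_SIZE_PARAM = "maxSize";

  private long maxRamMB = -1;
  private int maxSize = 512;

  @Override
  public Collection<String> getManagedResourceTypes() {
    return Collections.singleton(TYPE);
  }

  @Override
  public void setResourceLimits(Map<String, Object> limits) {
    for (Map.Entry<String, Object> e : limits.entrySet()) {
      // Every implementation repeats this key matching and casting by hand.
      if (!(e.getValue() instanceof Number)) {
        throw new IllegalArgumentException("Not a number: " + e.getKey());
      }
      Number value = (Number) e.getValue();
      if (MAX_RAM_MB_PARAM.equals(e.getKey())) {
        maxRamMB = value.longValue();
      } else if (MAX_SIZE_PARAM.equals(e.getKey())) {
        maxSize = value.intValue();
      } else {
        throw new IllegalArgumentException("Unsupported limit: " + e.getKey());
      }
    }
  }

  @Override
  public Map<String, Object> getResourceLimits() {
    Map<String, Object> limits = new HashMap<>();
    limits.put(MAX_RAM_MB_PARAM, maxRamMB);
    limits.put(MAX_SIZE_PARAM, maxSize);
    return limits;
  }
}

// Hypothetical plugin: "registration" boils down to a String comparison.
class LoosePluginSketch {
  String getType() { return LooseCacheSketch.TYPE; }

  boolean supports(ManagedComponentSketch component) {
    return component.getManagedResourceTypes().contains(getType());
  }
}
```

Note that a misspelled key (say {{"maxRamMb"}}) only surfaces at run time in this style, and only if the implementation bothers to reject unknown keys.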
...but why is this coupling so loose? why is each implementation of
{{ManagedComponent}} left to fend for itself in terms of building a "switch"
statement to process the {{Map<String,Object>}} inputs it might receive, instead
of having more type specific sub-classes like {{ManagedCacheComponent extends
ManagedComponent}} that defines all the get/set methods a
{{CacheManagerPlugin}} expects to be able to call on any
{{ManagedCacheComponent}} registered with it – at which point even the
_registration_ could be statically typed, w/o the need for indirectly comparing
Strings to {{ResourceManagerPlugin.getType()}} ?
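Something like the following is what i have in mind for the typed variant. It is only a sketch under assumptions: {{ManagedCacheComponent}} is the hypothetical sub-interface floated above, {{TypedManagedComponent}} stands in for the patch's {{ManagedComponent}}, and the getter/setter names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in marker for the patch's ManagedComponent (assumption: nothing else
// is needed from it for this illustration).
interface TypedManagedComponent {}

// Hypothetical typed sub-interface: the plugin's expectations become methods.
interface ManagedCacheComponent extends TypedManagedComponent {
  long getMaxRamMB();
  void setMaxRamMB(long maxRamMB);
  int getMaxSize();
  void setMaxSize(int maxSize);
}

// Hypothetical plugin that only accepts cache components.
class TypedCachePluginSketch {
  private final List<ManagedCacheComponent> caches = new ArrayList<>();

  // Passing anything that is not a ManagedCacheComponent is a compile error,
  // so no String type comparison is needed at registration time.
  void register(ManagedCacheComponent cache) { caches.add(cache); }

  // Lowers the RAM limit of every registered cache that exceeds limitMB.
  void capMaxRamMB(long limitMB) {
    for (ManagedCacheComponent cache : caches) {
      if (cache.getMaxRamMB() > limitMB) {
        cache.setMaxRamMB(limitMB);
      }
    }
  }
}

// Hypothetical cache: no key matching or casting, just plain setters.
class TypedCacheSketch implements ManagedCacheComponent {
  private long maxRamMB;
  private int maxSize;

  TypedCacheSketch(long maxRamMB, int maxSize) {
    this.maxRamMB = maxRamMB;
    this.maxSize = maxSize;
  }

  @Override public long getMaxRamMB() { return maxRamMB; }
  @Override public void setMaxRamMB(long maxRamMB) { this.maxRamMB = maxRamMB; }
  @Override public int getMaxSize() { return maxSize; }
  @Override public void setMaxSize(int maxSize) { this.maxSize = maxSize; }
}
```

The trade-off is that a component managed along several "aspects" would need to implement several such interfaces, which may or may not be the flexibility the Map-based approach is trying to preserve.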
(Again: I'm not saying this loose coupling is inherently bad and should be
replaced, I'm saying that I don't have enough grasp on the specifics of the
hypothetical functionality you want to add later that might take advantage of
this loose coupling to understand if it's strictly necessary; so i'm not even
sure i understand what the right questions to ask are – these are just the
questions that occur to me diving into the code at random. My gut says using
more marker interfaces and statically typed methods will help reduce "human
error" bugs in the code.)
----
Ignoring the internal API/design, I'm also still confused a little bit about
how exactly we want/expect a system like this to work and "do stuff" with
something like a "cache memory" pool – and was hoping for more specifics on
what you envisioned for the "Cluster Administrator's" UX as pools are
created/configured/used in story form.
In your Story #1, and subsequent reply to another comment, you mentioned...
{quote}In order to do this we need a control mechanism that is able to adjust
individual cache sizes per core, based on the total hard limit and the actual
current "need" of a core, defined as a combination of hit ratio, QPS, and other
arbitrary quality factors / SLA. This control mechanism also needs to be able
to forcibly reduce excessive usage ...
...
* the plugin is executed periodically to check the current resource usage of
all registered caches, using eg. the aggregated value of ramBytesUsed.
* as a result of this action some of the cache content will be evicted sooner
and more aggressively than initially configured, thus freeing more RAM.
* when the memory pressure decreases the CacheManagerPlugin re-adjusts the
maxRamMB settings of each cache to the initially configured values. ...
...
{quote}does that imply that once SolrCache(s) are part of a "pool" they no
longer have their own max size(s)?
{quote}
They still do - but it's used as the starting point for proportional
adjustments.
{quote}
You seem to be saying that the {{CacheManagerPlugin}} would only ever reduce
the {{maxRamMB}} setting of some caches at run time, if/when the sum of
{{ramBytesUsed}} for all caches exceeds the pool's {{maxRamMB}} ... but it seems
like in order for that to happen, there's one of two unstated implications;
either:
* users who want to use these pools need to change the individual cache's
configured {{maxRamMB}} to be much higher than they are today. (potentially to
the same value as the {{maxRamMB}} of the pool?)
* OR: that we only expect the plugin to kick in and affect caches if/when the
number of _cores_ increases. (as a result of collection creation or autoscaling)
If for example "Bob" has a single core w/3 caches each configured with
{{maxRamMB=4GB}}, and Bob sets up a pool for all caches on his system
configured with {{maxRamMB=12GB}} – then in theory, unless more caches are
added to the pool (by more cores being created on this node) that pool/plugin
is never going to adjust the sizes of those caches because the aggregated sum
of the {{ramBytesUsed}} should never exceed 12GB.
Since I assume the idea is that these pools & {{ResourceManagerPlugins}} will
be useful under varying _query_ load, and not just for taking action when
provisioning new collections/cores, I gather the expectation is that when Bob
decides to enable a {{CacheManagerPlugin}} pool, Bob should configure the
individual caches in that pool with (relatively) "high" {{maxRamMB}} sizes, so
that they could individually use more/less RAM than each other and later be
"reined in" by the pool's {{CacheManagerPlugin}} ... so perhaps in Bob's
solrconfig.xml all 3 caches have {{maxRamMB=8GB}}, while the pool has
{{maxRamMB=12GB}}. If/when the {{CacheManagerPlugin}} detects that Bob's users
have filled cacheXX to 5GB, cacheYY to 6GB, and cacheZZ to 2GB, it sees the
total = 13GB > 12GB and adjusts the {{maxRamMB}} of the individual caches down
proportionately so that the new {{maxRamMB}} total == 12GB.
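Spelling out the arithmetic i'm assuming for that scenario: the sketch below scales each cache's new limit in proportion to its current usage, which is only my guess at what "proportional adjustments" means (the patch might instead scale the configured {{maxRamMB}} values). {{ProportionalLimitsSketch}} and {{adjust}} are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the Bob scenario; not code from the patch.
class ProportionalLimitsSketch {

  // Returns new per-cache limits in GB. If total usage fits within the pool
  // limit, the configured limits are returned unchanged. Otherwise each
  // cache's limit becomes its current usage scaled by poolLimit / totalUsed
  // (assumption: scaling is proportional to usage, not to configured limits).
  static Map<String, Double> adjust(Map<String, Double> usedGB,
                                    Map<String, Double> configuredGB,
                                    double poolLimitGB) {
    double totalUsed = 0;
    for (double used : usedGB.values()) {
      totalUsed += used;
    }
    if (totalUsed <= poolLimitGB) {
      return new LinkedHashMap<>(configuredGB);
    }
    double factor = poolLimitGB / totalUsed;
    Map<String, Double> adjusted = new LinkedHashMap<>();
    for (Map.Entry<String, Double> e : usedGB.entrySet()) {
      adjusted.put(e.getKey(), e.getValue() * factor);
    }
    return adjusted;
  }

  public static void main(String[] args) {
    Map<String, Double> used = new LinkedHashMap<>();
    used.put("cacheXX", 5.0);
    used.put("cacheYY", 6.0);
    used.put("cacheZZ", 2.0);
    Map<String, Double> configured = new LinkedHashMap<>();
    for (String name : used.keySet()) {
      configured.put(name, 8.0);
    }
    // 13GB used > 12GB pool limit, so every limit is scaled by 12/13.
    System.out.println(adjust(used, configured, 12.0));
  }
}
```

Under this assumption the new limits come out to roughly 4.6GB, 5.5GB and 1.8GB. In the earlier 3 x 4GB configuration the usage can never exceed 12GB, so {{adjust}} would always return the configured limits untouched.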
But if that's the case, then (based on the typical usage patterns of
SolrCaches) I don't understand your comment about "when the memory pressure
decreases" ... how/when can/should a {{CacheManagerPlugin}} assume/recognize
that the memory pressure has decreased? Because whatever specific new
{{maxRamMB}} values those caches get configured with, they are going to keep
using roughly that much RAM (with some small variation in Query/DocSet/DocList
size) as new requests come in – evicting objects only to replace them with
(similarly sized) new objects ... there's really no reason to assume/expect the
{{ramBytesUsed}} of a cache to _decrease_ in a meaningful way once it's full.
(ie: collections/cores that contain mostly small documents don't tend to
suddenly get a lot of large documents added to them that significantly impact
the size of the documentCache. collections/cores that have big filterCaches
don't tend to suddenly stop getting requests with {{fq}} params and no longer
have a need for a big filterCache, etc...)
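A crude way to see the "full caches stay full" behavior i'm describing is a toy LRU simulation. This is not Solr code: {{FullCacheSketch}} is a hypothetical class built on {{java.util.LinkedHashMap}}, it assumes strict LRU eviction with equal-sized entries, and its {{ramBytesUsed()}} is just the sum of payload lengths rather than Solr's real accounting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (not Solr code): a size-bounded LRU whose reported usage never
// drops once it has filled, because each eviction is paired with an insert.
class FullCacheSketch {

  static class Lru extends LinkedHashMap<Integer, byte[]> {
    private final int maxEntries;

    Lru(int maxEntries) {
      super(16, 0.75f, true); // access-ordered, i.e. LRU
      this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
      return size() > maxEntries;
    }

    // Simplified stand-in for ramBytesUsed: sum of payload sizes only.
    long ramBytesUsed() {
      long sum = 0;
      for (byte[] value : values()) {
        sum += value.length;
      }
      return sum;
    }
  }

  // Inserts `requests` distinct entries of `entryBytes` each and returns the
  // smallest ramBytesUsed observed at any point after the cache first filled.
  static long minBytesAfterFull(int maxEntries, int entryBytes, int requests) {
    Lru cache = new Lru(maxEntries);
    long min = Long.MAX_VALUE;
    for (int i = 0; i < requests; i++) {
      cache.put(i, new byte[entryBytes]);
      if (cache.size() == maxEntries) {
        min = Math.min(min, cache.ramBytesUsed());
      }
    }
    return min;
  }
}
```

In this model the usage sits at {{maxEntries * entryBytes}} for as long as requests keep arriving, so a manager watching only {{ramBytesUsed}} would never observe "decreased pressure" on its own.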
So when/how/why exactly would a {{CacheManagerPlugin}} ever "re-adjusts the
maxRamMB settings of each cache to the initially configured values." ?
----
{quote}consistently use the name "component" instead of the confusing "resource"
{quote}
Hmmm, Did you mean to upload a diff patch? the latest i see (#12975831) still
contains lots of new class names referring to "Resource" instead of "Component"
...
{noformat}
$ ls solr/core/src/java/org/apache/solr/managed/*Resource*
solr/core/src/java/org/apache/solr/managed/AbstractResourceManagerPlugin.java
solr/core/src/java/org/apache/solr/managed/DefaultResourceManager.java
solr/core/src/java/org/apache/solr/managed/DefaultResourceManagerPluginFactory.java
solr/core/src/java/org/apache/solr/managed/DefaultResourceManagerPool.java
solr/core/src/java/org/apache/solr/managed/NoOpResourceManager.java
solr/core/src/java/org/apache/solr/managed/ResourceManager.java
solr/core/src/java/org/apache/solr/managed/ResourceManagerPluginFactory.java
solr/core/src/java/org/apache/solr/managed/ResourceManagerPlugin.java
solr/core/src/java/org/apache/solr/managed/ResourceManagerPool.java
{noformat}
> Create resource management API
> ------------------------------
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
> Issue Type: New Feature
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch,
> SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.