[ 
https://issues.apache.org/jira/browse/SOLR-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460950#comment-17460950
 ] 

Houston Putman commented on SOLR-15803:
---------------------------------------

Using similar methodology to 
https://issues.apache.org/jira/browse/SOLR-14656?focusedCommentId=17160311&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17160311,
 I tested whether this change slows down collection creation, as the 
autoscaling feature did:

 

!Screen Shot 2021-12-16 at 1.23.40 PM.png|width=509,height=358!

There is a slight increase in collection creation time (averaging ~200 
milliseconds over the course of 400 collections), but this is small enough 
that users should not notice it.

 

Note: I used 10 Solr nodes, each given 2 GB of memory, running in a local 
Kubernetes cluster via the Solr Operator.

> Allow AssignStrategy to process multiple AssignRequests with 
> cross-coordination
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-15803
>                 URL: https://issues.apache.org/jira/browse/SOLR-15803
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Houston Putman
>            Assignee: Houston Putman
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When doing testing for SOLR-15795, I found that if you have an empty node 
> when running the REPLACENODE command, then many times all replicas will be 
> placed on that same node, even if it doesn't result in an even distribution 
> in your cluster.
> When looking at the code, it made sense. The ReplaceNodeCmd goes through a 
> loop for every replica on the sourceNode, and uses the AssignStrategy class 
> to assign a node for each replica, using the clusterstate. However, the 
> clusterstate does not change between these replicas, so the most advantageous 
> node for one replica is likely to be the most advantageous for many 
> replicas given the same cluster state. Therefore all replicas were being 
> scheduled for the same node in my testing.
> An easy (in theory) solution is to let AssignStrategy take a list of 
> AssignRequests in assign(), and each request in this list will account for 
> the replicaPlacements decided for the previous requests in the list. That 
> way, the ReplaceNodeCmd can create its list of AssignRequests, and issue 
> them all at once to AssignStrategy, which will come up with the _optimal_ 
> plan for all replicas *together*.
> Because this is an API in AssignStrategy, it will work with the new 
> autoscaling APIs or using the legacy assign strategy.
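
To illustrate the behaviour described in the issue above, here is a minimal Java sketch. It is not the Solr AssignStrategy API: the class PlacementSketch, its methods, and the "fewest replicas wins" rule are all hypothetical stand-ins chosen for illustration. It contrasts assigning each request against an unchanged snapshot of the cluster with assigning a batch of requests where each decision is visible to the next.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names exist in Solr.
final class PlacementSketch {

  // Stand-in placement rule: pick the node with the fewest replicas.
  // Ties go to the first node in the map's iteration order.
  static String leastLoaded(Map<String, Integer> replicaCounts) {
    if (replicaCounts.isEmpty()) {
      throw new IllegalArgumentException("no nodes available");
    }
    String best = null;
    for (Map.Entry<String, Integer> e : replicaCounts.entrySet()) {
      if (best == null || e.getValue() < replicaCounts.get(best)) {
        best = e.getKey();
      }
    }
    return best;
  }

  // Each request is evaluated against the same unchanged snapshot,
  // so every request picks the same "best" node.
  static List<String> assignIndependently(Map<String, Integer> replicaCounts, int requests) {
    List<String> placements = new ArrayList<>();
    for (int i = 0; i < requests; i++) {
      placements.add(leastLoaded(replicaCounts));
    }
    return placements;
  }

  // Requests are processed as one batch: each decision is recorded in a
  // working copy, so later requests account for earlier placements.
  static List<String> assignTogether(Map<String, Integer> replicaCounts, int requests) {
    Map<String, Integer> working = new LinkedHashMap<>(replicaCounts);
    List<String> placements = new ArrayList<>();
    for (int i = 0; i < requests; i++) {
      String node = leastLoaded(working);
      working.merge(node, 1, Integer::sum); // count the replica just placed
      placements.add(node);
    }
    return placements;
  }
}
```

With two nodes holding two replicas each and one empty node, assignIndependently sends every request to the empty node, while assignTogether spreads the replicas out once the empty node catches up with the others. A real placement strategy weighs far more than replica counts, so this only shows why sharing state across requests matters.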



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
