Patson Luk created SOLR-17076:
---------------------------------

             Summary: Replica Placement could be slow for new collection with 
high amount of shard in a cluster with plenty replicas
                 Key: SOLR-17076
                 URL: https://issues.apache.org/jira/browse/SOLR-17076
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
    Affects Versions: 9.3
            Reporter: Patson Luk


In our cluster, which has hundreds of thousands of replicas, collection 
creation is slow when the new collection has thousands of shards.

In particular, more than 4 minutes of computation time is spent between the 
[initial collection creation|https://github.com/apache/solr/blob/ebcb3b92f6f0b2736d312a83de9d2ccadc0980aa/solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java#L115]
and [the SliceMutator creating the slice|https://github.com/apache/solr/blob/ebcb3b92f6f0b2736d312a83de9d2ccadc0980aa/solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java#L336].

With some profiling and metrics checking, it appears that during those 4 mins, 
almost all of the CPU time is spent in 
{{org.apache.solr.cluster.placement.plugins.OrderedNodePlacementPlugin$WeightedNode.getAllReplicasOnNode}}.

For each new shard, it invokes this method to compute the weight. The method 
iterates over every collection and shard and creates a new replica set on each 
call. According to the profiler and CPU metrics, this computation is costly in 
our environment.
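
To illustrate why this pattern scales badly, here is a minimal toy model in 
Python. It is not Solr's code: the cluster layout, the function names 
(build_cluster, all_replicas_on_node, place_naive, place_cached) and the 
"least replicas wins" weight are all assumptions made for illustration. It only 
shows that rescanning every replica for each node and each new shard costs 
roughly new_shards * nodes * total_replicas visits, while counting once and 
updating incrementally costs about total_replicas visits.

```python
from collections import defaultdict


def build_cluster(num_collections, shards_per_collection, nodes):
    """Toy cluster: {collection: {shard: [(replica_name, node), ...]}}."""
    cluster = {}
    i = 0
    for c in range(num_collections):
        shards = {}
        for s in range(shards_per_collection):
            # One replica per shard, spread round-robin over the nodes.
            shards[f"shard{s}"] = [(f"c{c}_s{s}_r0", nodes[i % len(nodes)])]
            i += 1
        cluster[f"coll{c}"] = shards
    return cluster


def all_replicas_on_node(cluster, node, counter):
    """Full scan: visits every replica and builds a fresh set on each call."""
    found = set()
    for shards in cluster.values():
        for replicas in shards.values():
            for name, replica_node in replicas:
                counter["visits"] += 1
                if replica_node == node:
                    found.add(name)
    return found


def place_naive(cluster, nodes, new_shards):
    """Pick the least-loaded node per new shard, rescanning the cluster each time."""
    counter = {"visits": 0}
    placed = defaultdict(int)  # replicas placed so far in this request
    placements = []
    for _ in range(new_shards):
        weights = {
            n: len(all_replicas_on_node(cluster, n, counter)) + placed[n]
            for n in nodes
        }
        best = min(nodes, key=lambda n: (weights[n], n))
        placed[best] += 1
        placements.append(best)
    return placements, counter["visits"]


def place_cached(cluster, nodes, new_shards):
    """Same decisions, but replica counts are computed once and then updated."""
    counter = {"visits": 0}
    counts = {n: 0 for n in nodes}
    for shards in cluster.values():
        for replicas in shards.values():
            for _, replica_node in replicas:
                counter["visits"] += 1
                if replica_node in counts:
                    counts[replica_node] += 1
    placements = []
    for _ in range(new_shards):
        best = min(nodes, key=lambda n: (counts[n], n))
        counts[best] += 1
        placements.append(best)
    return placements, counter["visits"]


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3"]
    cluster = build_cluster(4, 5, nodes)  # 20 existing replicas
    print(place_naive(cluster, nodes, 6)[1])   # replica visits with rescans
    print(place_cached(cluster, nodes, 6)[1])  # replica visits with one pass
```

In this toy setup both functions choose the same nodes; only the number of 
replica visits differs. Whether a similar caching approach fits 
OrderedNodePlacementPlugin is a separate question that this sketch does not 
answer.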



