HoustonPutman commented on PR #1650:
URL: https://github.com/apache/solr/pull/1650#issuecomment-1588135823

   Thanks for the review Radu!
   
   > I saw a "parallel" flag, but I think replicas are moved one at a time, 
correct? If so, it will likely prove to be a bottleneck down the road.
   
   The replicas will be moved in parallel, but in the future we definitely 
should add a bottleneck option here. That's a great call-out. Separate PR IMO.
   
   > Am I reading the code right that we compute all the shard movements that 
have to be done, then do them all, then check if we were successful?
   
   Yes, that is correct. And this is also a good future edition I think. We can 
limit the balanceReplicas logic to `x` movements, then implement those 
movements then wait for the cluster to become healthy. We could do that logic 
in a loop until 0 movements are returned. Also a separate PR, but I love the 
idea.
   
   > I saw a maxBalanceSkew property, but I'm not really sure what it does. Is 
this the number of "weight points" difference between nodes under which we say 
"don't bother?" And that would be different based on the placement plugin?
   
   Currently it does absolutely nothing, but yes, the idea is that you move 
things around until the difference between the lowest weight and the highest 
weight is <= `maxBalanceSkew`. I didn't implement it yet, because I'm not sure 
how useful it would be honestly... I was also thinking originally that it would 
return an error if the maxBalanceSkew couldn't be achieved, but once again I'm 
not convinced of a use case here. I should remove it for now, since it is 
unused.
   
   > The balancing code that I saw for the Simple placement plugin seems more 
complicated than the minimizeCores one.
   
   Yeah, simple is more complex than minimizeCores. It was the Legacy 
implementation, but the name "Simple" was chosen over "Legacy".
   
   > In Affinity, we don't seem to really deal with the withCollection case, in 
the sense that we would probably want to move shards of both collections at the 
same time - we check if the destination node has that collection, correct?
   
   That is correct. It makes sure that a move works with the existing replicas, 
but it is not yet smart enough to move replicas together. I have ideas on how 
to improve this down the line, but I don't think its necessary for the first 
implementation.
   
   Basically for `canRemoveReplica(r)`, it would return the list of other 
replicas that would need to be removed along with that replica, and 
`canAddReplica(r)`, it would return the list of other replicas that would need 
to be added along with that replica. Then the orderedNode logic could make it 
work. Once again, this adds some complexity, and it can likely wait for another 
PR.
   
   > Do I get it right that we soft-enforce constraints like AZ-awareness by 
adding big weights? If so, would it be better down the road to add them as 
constraints?
   
   So yeah, they can absolutely be used as constraints, but right now (before 
this PR) they aren't, so this logic is keeping in-line with the logic that 
existed before.
   
   > And I think your PR already does that, no? If I simply create a new 
replica, the placement plugin will put it on the node with the lowest weight, 
correct?
   
   This is correct, and we can definitely add new constraints (like 
shardsPerNode/replicasPerNode) to the different PlacementPlugins. It will be 
especially easy to do given this new framework going forward!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to