yujun777 opened a new pull request, #26229:
URL: https://github.com/apache/doris/pull/26229

   Colocate table balancing currently works only within a group, not across 
groups. This can leave the cluster slightly imbalanced.
   For example, suppose bucket num = 3, with three BEs A/B/C and two groups 
group1/group2. Then we have:
   
   A [ group1:bucket0,  group2:bucket0]
   B [ group1:bucket1,  group2:bucket1]
   C [ group1:bucket2,  group2:bucket2]
   
   If we add a new BE D, then for each group: 
bucketNum(A)=bucketNum(B)=bucketNum(C)=1,  bucketNum(D)=0, 
   so each group is balanced on its own, but from a global view across all 
groups the cluster is not balanced. We should move one of the buckets to D.
   
   To balance across all groups, we should also compare the total number of 
buckets that each BE holds over all groups.
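
The A/B/C/D example above can be sketched in a few lines of Java. This is only an illustration of the idea, not the code in this PR: the class `GlobalColocateBalanceSketch`, its methods, and the single-replica bucket layout are assumptions made for the example. It repeatedly moves one bucket from the globally most loaded BE to the globally least loaded BE, and only picks a group in which the move keeps that group balanced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and data layout are hypothetical, not Doris code. */
final class GlobalColocateBalanceSketch {

    /** Counts buckets per backend; groups.get(g).get(i) is the BE hosting bucket i of group g. */
    static Map<String, Integer> countBuckets(List<String> backends, List<List<String>> groups) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String be : backends) {
            counts.put(be, 0);
        }
        for (List<String> group : groups) {
            for (String be : group) {
                counts.merge(be, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Moves buckets until the global max and min bucket counts differ by at most 1. */
    static int rebalance(List<String> backends, List<List<String>> groups) {
        int moves = 0;
        while (true) {
            Map<String, Integer> total = countBuckets(backends, groups);
            String src = null;
            String dst = null;
            for (String be : backends) {
                if (src == null || total.get(be) > total.get(src)) src = be;
                if (dst == null || total.get(be) < total.get(dst)) dst = be;
            }
            if (total.get(src) - total.get(dst) <= 1) {
                return moves; // globally balanced
            }
            boolean moved = false;
            for (List<String> group : groups) {
                Map<String, Integer> inGroup = countBuckets(backends, List.of(group));
                // Moving src -> dst keeps this group balanced only if src holds more here.
                if (inGroup.get(src) > inGroup.get(dst)) {
                    group.set(group.indexOf(src), dst);
                    moved = true;
                    moves++;
                    break;
                }
            }
            if (!moved) {
                return moves;
            }
        }
    }

    public static void main(String[] args) {
        List<String> backends = List.of("A", "B", "C", "D");
        List<List<String>> groups = new ArrayList<>();
        groups.add(new ArrayList<>(List.of("A", "B", "C"))); // group1
        groups.add(new ArrayList<>(List.of("A", "B", "C"))); // group2
        System.out.println("before: " + countBuckets(backends, groups));
        int moves = rebalance(backends, groups);
        System.out.println("after " + moves + " move(s): " + countBuckets(backends, groups));
    }
}
```

With the layout from the example, the sketch moves exactly one bucket to D, after which the per-BE totals are A=1, B=2, C=2, D=1.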
   
   We ran a test: create 100 groups with replica num = 3 and backend num = 4, 
then add two new backends.
   
   After colocate balancing completes, we calculate each backend's total 
replica num (a smaller max / min is better):
   
   |case|backend min replica num|backend max replica num| max / min|
   |----|----|----|----|
   |OLD|2465|3163|1.196|
   |PR| 3082|3098|1.005|
   
   We calculate each backend's total data size (a smaller max / min is better):
   
   |case|backend min data size|backend max data size| max / min|
   |----|----|----|----|
   |OLD|8661132902400|10341685657600|1.194|
   |PR| 9814566502400|10866707660800|1.107|   
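
For reference, the max / min column is just the largest per-backend value divided by the smallest. A minimal sketch, using a hypothetical helper `maxMinRatio` and only the min and max data sizes reported in the table above (the per-backend values in between are not listed in this description):

```java
import java.util.Locale;

/** Illustrative helper; the class and method names are hypothetical. */
final class BalanceRatioSketch {

    /** Returns max / min over the given per-backend values (all assumed to be > 0). */
    static double maxMinRatio(long[] perBackend) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : perBackend) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return (double) max / (double) min;
    }

    public static void main(String[] args) {
        // Min and max data sizes taken from the data size table.
        double oldRatio = maxMinRatio(new long[] {8661132902400L, 10341685657600L});
        double prRatio = maxMinRatio(new long[] {9814566502400L, 10866707660800L});
        System.out.println(String.format(Locale.ROOT, "OLD %.3f, PR %.3f", oldRatio, prRatio));
    }
}
```

A ratio of 1.0 would mean every backend holds exactly the same amount.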
   
   This PR also fixes the following problems:
   1.  A bucket could be relocated to a backend that cannot hold the bucket's 
data, which left that group's scheduling stuck;
   2.  An exception could be thrown during relocation when sorting the backends.
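
One way to picture the first fix is to filter out backends without enough free space before choosing a relocation target. The sketch below is an assumption about the general approach, not the code in this PR: the `Backend` class, its fields, and `pickTarget` are hypothetical. It also orders candidates with a comparator built from key extractors plus an id tie-breaker, so the ordering is a consistent total order; whether that relates to the sorting exception mentioned in item 2 is not stated in this description.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative sketch only; all names here are hypothetical, not Doris classes. */
final class RelocateTargetSketch {

    static final class Backend {
        final long id;
        final long availableBytes; // free capacity on this backend
        final int bucketNum;       // buckets already placed on this backend

        Backend(long id, long availableBytes, int bucketNum) {
            this.id = id;
            this.availableBytes = availableBytes;
            this.bucketNum = bucketNum;
        }
    }

    /** Picks the least loaded backend that can still hold the bucket's data, if any. */
    static Optional<Backend> pickTarget(List<Backend> candidates, long bucketDataBytes) {
        return candidates.stream()
                // Skip backends that cannot hold the bucket, so the move cannot get stuck on them.
                .filter(be -> be.availableBytes >= bucketDataBytes)
                // Fewest buckets first; ties broken by id for a deterministic order.
                .min(Comparator.comparingInt((Backend be) -> be.bucketNum)
                        .thenComparingLong(be -> be.id));
    }

    public static void main(String[] args) {
        List<Backend> backends = List.of(
                new Backend(1, 10, 0),   // least loaded, but too small for the bucket
                new Backend(2, 100, 2),
                new Backend(3, 100, 1));
        Optional<Backend> target = pickTarget(backends, 50);
        System.out.println(target.map(be -> "target backend id = " + be.id)
                .orElse("no backend can hold the bucket"));
    }
}
```

Returning an empty `Optional` when no backend has room lets the caller skip the relocation instead of scheduling a move that cannot finish.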
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

