kangkaisen opened a new issue #546: Colocate Join table balance bug
URL: https://github.com/apache/incubator-doris/issues/546
 
 
   **Describe the bug**
   Yesterday I added 5 BEs to a production cluster. While the colocate tables were being balanced, the following error occurred:
   ```
   2019-01-16 12:44:31,901 INFO 341 [ColocateTableBalancer.checkGroupTablets():221] AddedBackendIds [63256529, 63256540, 63256553] for colocate group 55245478
   2019-01-16 12:44:31,901 INFO 341 [ColocateTableBalancer.handleBackendAdded():491] handleBackendAdded start
   2019-01-16 12:44:31,901 INFO 341 [ColocateTableBalancer.handleBackendAdded():505] for colocate group 55245478, needMoveBucketSeqs: 5 , bucketSeqPerNewBackend: 1
   2019-01-16 12:44:31,901 ERROR 341 [ColocateTableBalancer.handleBackendAdded():556] Index: 3, Size: 3
   java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
           at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[?:1.8.0_112]
           at java.util.ArrayList.get(ArrayList.java:429) ~[?:1.8.0_112]
           at org.apache.doris.clone.ColocateTableBalancer.handleBackendAdded(ColocateTableBalancer.java:530) [palo-fe.jar:?]
           at org.apache.doris.clone.ColocateTableBalancer.checkGroupTablets(ColocateTableBalancer.java:222) [palo-fe.jar:?]
           at org.apache.doris.clone.ColocateTableBalancer.runOneCycle(ColocateTableBalancer.java:80) [palo-fe.jar:?]
           at org.apache.doris.common.util.Daemon.run(Daemon.java:96) [palo-fe.jar:?]
   ```
   
   This bug is obvious.
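
The log shows 5 bucket sequences to move, `bucketSeqPerNewBackend` of 1, and only 3 added backends. A minimal sketch of how an index of 3 into a list of size 3 can arise from those numbers, assuming the new-backend list is indexed by `i / bucketSeqPerNewBackend`. The class and method names below are hypothetical and are not taken from `ColocateTableBalancer`; the round-robin variant is one possible way to stay in bounds, not necessarily the fix that was applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NewBackendAssignment {
    // Assumed shape of the failing access (hypothetical, not the Doris code):
    // with 5 bucket seqs, 1 per backend, and 3 backends, i = 3 reads index 3.
    static List<Long> assignUnchecked(int needMoveBucketSeqs, int bucketSeqPerNewBackend,
                                      List<Long> newBackendIds) {
        List<Long> targets = new ArrayList<>();
        for (int i = 0; i < needMoveBucketSeqs; i++) {
            targets.add(newBackendIds.get(i / bucketSeqPerNewBackend));
        }
        return targets;
    }

    // Round-robin variant: the modulo keeps the index inside the list.
    static List<Long> assignRoundRobin(int needMoveBucketSeqs, List<Long> newBackendIds) {
        List<Long> targets = new ArrayList<>();
        if (newBackendIds.isEmpty()) {
            return targets; // nothing to assign to
        }
        for (int i = 0; i < needMoveBucketSeqs; i++) {
            targets.add(newBackendIds.get(i % newBackendIds.size()));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Backend ids copied from the AddedBackendIds log line.
        List<Long> added = Arrays.asList(63256529L, 63256540L, 63256553L);
        System.out.println(assignRoundRobin(5, added));
        try {
            assignUnchecked(5, 1, added);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked variant failed: " + e.getMessage());
        }
    }
}
```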
   
   After I fixed this bug, I found several hours later that all colocate groups were still balancing. Looking at the log, I **found that the colocate meta had become wrong**!
   
   ```
   2019-01-16 16:12:34,496 INFO 2682 [ColocateTableBalancer.checkBalancingGroups():89] colocate group: 55245478 backendsPerBucketSeq is [[6913774, 21833567, 63256487, 63256517, 63256529], [15310, 6913774, 63256487, 63256517, 63256540], [21833568, 23694, 63256487, 63256517, 63256553], [23693, 3820567, 63256487, 63256517, 63256529], [18711, 3820568, 63256540], [21833566, 10477683, 21833567], [18710, 18711, 23695], [23693, 10477683, 18710], [23695, 23694, 10469551], [3820567, 21820, 21833565], [23694, 10477683, 21833566], [10002, 21833567, 3820567], [10469551, 3820568, 21833566], [23694, 21833564, 21820], [10002, 10477683, 21833567], [18709, 10002, 23694], [10002, 23695, 21820], [18709, 23694, 15310], [10477683, 10469551, 3820568], [18711, 10469551, 18710], [21833564, 21833567, 18711], [3820567, 10469551, 6913774], [15310, 21833566, 23693], [15310, 21833567, 21833564], [23695, 15310, 18711], [6913774, 21833567, 23695], [21833567, 23694, 15310], [21833568, 21833565, 23694], [3820567, 21833566, 18711], [23693, 21833567, 21833568], [3820567, 23695, 18709], [18711, 21833568, 21833564]]
   ```
   The replicationNum for the colocate group is 3, so each BucketSeq should have exactly 3 backends. In the log above, the first four BucketSeqs each list 5.
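
The invariant can be checked mechanically. The sketch below is a hypothetical helper, not part of Doris, that reports which bucket sequences hold a number of backends different from `replicationNum`. The sample data is the first six entries of `backendsPerBucketSeq` from the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColocateMetaCheck {
    // Returns the indexes of bucket sequences whose backend count
    // differs from replicationNum. Hypothetical helper for illustration.
    static List<Integer> findInvalidBucketSeqs(List<List<Long>> backendsPerBucketSeq,
                                               int replicationNum) {
        List<Integer> invalid = new ArrayList<>();
        for (int seq = 0; seq < backendsPerBucketSeq.size(); seq++) {
            if (backendsPerBucketSeq.get(seq).size() != replicationNum) {
                invalid.add(seq);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        // First six bucket sequences copied from the log line.
        List<List<Long>> meta = Arrays.asList(
            Arrays.asList(6913774L, 21833567L, 63256487L, 63256517L, 63256529L),
            Arrays.asList(15310L, 6913774L, 63256487L, 63256517L, 63256540L),
            Arrays.asList(21833568L, 23694L, 63256487L, 63256517L, 63256553L),
            Arrays.asList(23693L, 3820567L, 63256487L, 63256517L, 63256529L),
            Arrays.asList(18711L, 3820568L, 63256540L),
            Arrays.asList(21833566L, 10477683L, 21833567L));
        // Prints [0, 1, 2, 3]: the four bucket sequences with 5 backends.
        System.out.println(findInvalidBucketSeqs(meta, 3));
    }
}
```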
   
   The reason is that I added the new BEs one by one with long intervals in between, and ColocateTableBalancer doesn't skip balancing when the colocate group is already balancing.
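
A minimal sketch of the missing guard, assuming the balancer can track which groups are currently balancing. `BalancingGuard` and its methods are hypothetical names for illustration; the real fix would live in `ColocateTableBalancer` and use whatever state it already keeps for balancing groups.

```java
import java.util.HashSet;
import java.util.Set;

class BalancingGuard {
    // Ids of colocate groups that have a balance round in progress.
    private final Set<Long> balancingGroups = new HashSet<>();

    // Returns true if a new balance round may start for this group,
    // false if the group is already balancing and should be skipped.
    synchronized boolean tryStartBalance(long groupId) {
        // Set.add returns false when the id is already present.
        return balancingGroups.add(groupId);
    }

    // Called once the group has finished balancing and is stable again.
    synchronized void markStable(long groupId) {
        balancingGroups.remove(groupId);
    }

    public static void main(String[] args) {
        BalancingGuard guard = new BalancingGuard();
        long groupId = 55245478L; // group id from the log
        System.out.println(guard.tryStartBalance(groupId)); // first BE added
        System.out.println(guard.tryStartBalance(groupId)); // second BE added while balancing
        guard.markStable(groupId);
        System.out.println(guard.tryStartBalance(groupId)); // after the group is stable
    }
}
```

With a guard like this, a backend added while the group is still balancing would be picked up in a later cycle instead of appending more backends to bucket sequences that are already being moved.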
   

