[
https://issues.apache.org/jira/browse/KUDU-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
yejiabao_h updated KUDU-3325:
-----------------------------
Attachment: image-2021-10-06-15-36-40-996.png
image-2021-10-06-15-36-53-813.png
image-2021-10-06-15-37-09-520.png
image-2021-10-06-15-37-24-776.png
image-2021-10-06-15-37-42-533.png
image-2021-10-06-15-37-54-782.png
image-2021-10-06-15-38-06-575.png
image-2021-10-06-15-38-17-388.png
image-2021-10-06-15-38-29-176.png
image-2021-10-06-15-38-39-852.png
image-2021-10-06-15-38-53-343.png
image-2021-10-06-15-39-03-296.png
Component/s: consensus
Description:
h3. 1. Use kudu leader step down to generate multiple WAL entries
./kudu tablet leader_step_down $MASTER_IP 1299f5a939d2453c83104a6db0cae3e7
h4. wal
!image-2021-10-06-15-36-40-996.png!
h4. cmeta
!image-2021-10-06-15-36-53-813.png!
h3. 2. Stop one tserver to trigger tablet recovery, so that opid_index is
flushed to cmeta
!image-2021-10-06-15-37-09-520.png!
h4. wal
!image-2021-10-06-15-37-24-776.png!
h4. cmeta
!image-2021-10-06-15-37-42-533.png!
h3. 3. Stop all tservers and delete the tablet WAL
!image-2021-10-06-15-37-54-782.png!
h3. 4. Start all tservers
The index in the WAL restarts from 1, but the opid_index recorded in cmeta is
still 20, the value from before the WAL was deleted.
h4. wal
!image-2021-10-06-15-38-06-575.png!
h4. cmeta
!image-2021-10-06-15-38-17-388.png!
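The mismatch in step 4 can be expressed as a simple consistency check. The sketch below is only a toy model, not Kudu code: the `ReplicaState` class, its field names, and the WAL index values are assumptions made for illustration. Only the cmeta value 20 and the fact that the WAL index restarts from 1 come from the report.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    # Hypothetical model of the two values compared in this report.
    last_wal_index: int    # index of the newest entry in the WAL
    cmeta_opid_index: int  # opid_index stored with the config in cmeta


def cmeta_ahead_of_wal(state: ReplicaState) -> bool:
    """Return True when cmeta refers to an op index the WAL has not reached."""
    return state.cmeta_opid_index > state.last_wal_index


# Illustrative values: before the WAL is deleted the two are consistent.
before_delete = ReplicaState(last_wal_index=20, cmeta_opid_index=20)
# After deletion and restart the WAL index starts again from 1,
# while cmeta still holds 20.
after_delete = ReplicaState(last_wal_index=1, cmeta_opid_index=20)

print(cmeta_ahead_of_wal(before_delete))
print(cmeta_ahead_of_wal(after_delete))
```

A check of this kind at tablet startup would at least make the inconsistent state visible before a config change is attempted.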
h3. 5. Stop a tserver to trigger fault recovery
!image-2021-10-06-15-38-29-176.png!
When the leader recovers a replica and the master requests a Raft config
change to add the new replica, the leader replica ignores the change because
its op index is smaller than the opid_index recorded in cmeta.
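The behavior described in step 5 can be pictured with a small model of a "newer index wins" rule. This is a hedged sketch, not the actual Kudu implementation: `RaftConfig`, `apply_config_change`, the peer names, and the proposed index 5 are all hypothetical. It only assumes, as the report states, that a config change is ignored when its op index is smaller than the one in cmeta.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RaftConfig:
    # Hypothetical stand-in for a Raft config persisted in cmeta.
    opid_index: int
    peers: List[str]


def apply_config_change(committed: RaftConfig, proposed: RaftConfig) -> RaftConfig:
    """Toy rule: keep the committed config if the proposed one has a smaller index."""
    if proposed.opid_index < committed.opid_index:
        # The proposed change looks stale, so it is ignored.
        return committed
    return proposed


# cmeta still holds index 20 from before the WAL was deleted.
committed = RaftConfig(opid_index=20, peers=["ts-a", "ts-b", "ts-c"])
# The WAL restarted from 1, so the new change carries a small index (assumed 5).
proposed = RaftConfig(opid_index=5, peers=["ts-a", "ts-b", "ts-d"])

result = apply_config_change(committed, proposed)
print(result.peers)
```

Under this rule every config change is dropped until the WAL index grows past 20, which matches the failed recovery and rebalance shown in the report.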
h3. 6. Delete all WALs
!image-2021-10-06-15-38-39-852.png!
h3. 7. Run kudu cluster rebalance
./kudu cluster rebalance $MASTER_IP
!image-2021-10-06-15-38-53-343.png!
!image-2021-10-06-15-39-03-296.png!
The rebalance also fails when it tries to change the Raft config.
Summary: When wal is deleted, fault recovery and load balancing are
abnormal (was: when wal)
> When wal is deleted, fault recovery and load balancing are abnormal
> -------------------------------------------------------------------
>
> Key: KUDU-3325
> URL: https://issues.apache.org/jira/browse/KUDU-3325
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Reporter: yejiabao_h
> Priority: Major
> Attachments: image-2021-10-06-15-36-40-996.png,
> image-2021-10-06-15-36-53-813.png, image-2021-10-06-15-37-09-520.png,
> image-2021-10-06-15-37-24-776.png, image-2021-10-06-15-37-42-533.png,
> image-2021-10-06-15-37-54-782.png, image-2021-10-06-15-38-06-575.png,
> image-2021-10-06-15-38-17-388.png, image-2021-10-06-15-38-29-176.png,
> image-2021-10-06-15-38-39-852.png, image-2021-10-06-15-38-53-343.png,
> image-2021-10-06-15-39-03-296.png