[
https://issues.apache.org/jira/browse/KUDU-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
yejiabao_h updated KUDU-3325:
-----------------------------
Attachment: image-2021-10-06-19-23-51-769.png
> When wal is deleted, fault recovery and load balancing are abnormal
> -------------------------------------------------------------------
>
> Key: KUDU-3325
> URL: https://issues.apache.org/jira/browse/KUDU-3325
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Reporter: yejiabao_h
> Priority: Major
> Attachments: image-2021-10-06-15-36-40-996.png,
> image-2021-10-06-15-36-53-813.png, image-2021-10-06-15-37-09-520.png,
> image-2021-10-06-15-37-24-776.png, image-2021-10-06-15-37-42-533.png,
> image-2021-10-06-15-37-54-782.png, image-2021-10-06-15-38-06-575.png,
> image-2021-10-06-15-38-17-388.png, image-2021-10-06-15-38-29-176.png,
> image-2021-10-06-15-38-39-852.png, image-2021-10-06-15-38-53-343.png,
> image-2021-10-06-15-39-03-296.png, image-2021-10-06-19-23-51-769.png
>
>
> h3. 1. Use kudu leader step-down to create multiple WAL messages
> ./kudu tablet leader_step_down $MASTER_IP 1299f5a939d2453c83104a6db0cae3e7
> h4. wal
> !image-2021-10-06-15-36-40-996.png!
> h4. cmeta
> !image-2021-10-06-15-36-53-813.png!
> h3. 2. Stop one tserver to trigger tablet recovery, so that opid_index
> is flushed to cmeta
> !image-2021-10-06-15-37-09-520.png!
> h4. wal
> !image-2021-10-06-15-37-24-776.png!
> h4. cmeta
> !image-2021-10-06-15-37-42-533.png!
> h3. 3. Stop all tservers and delete the tablet WAL
> !image-2021-10-06-15-37-54-782.png!
> h3. 4. Start all tservers
> The index in the WAL restarts counting from 1, but the opid_index
> recorded in cmeta is still 20, the value from before the WAL was deleted.
>
> h4. wal
> !image-2021-10-06-15-38-06-575.png!
>
> h4. cmeta
> !image-2021-10-06-15-38-17-388.png!
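The mismatch described in step 4 can be stated as a simple consistency check. The sketch below is only a toy model, not Kudu code: `ReplicaState`, `cmeta_ahead_of_wal`, and the WAL index values are hypothetical names and numbers chosen for illustration. Only the cmeta value of 20 comes from the report.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    """Toy view of one tablet replica (hypothetical, not a Kudu type)."""
    last_wal_index: int    # index of the newest entry in the WAL
    cmeta_opid_index: int  # opid_index stored with the committed Raft config


def cmeta_ahead_of_wal(state: ReplicaState) -> bool:
    """Return True when cmeta refers to a log index the WAL does not contain."""
    return state.cmeta_opid_index > state.last_wal_index


# Before the WAL is deleted: the WAL covers the index recorded in cmeta.
# (25 is an assumed value for illustration.)
before = ReplicaState(last_wal_index=25, cmeta_opid_index=20)

# After the WAL is deleted and the tservers restart: the WAL counts from 1
# again, while cmeta still holds 20. (3 is an assumed value.)
after = ReplicaState(last_wal_index=3, cmeta_opid_index=20)

print(cmeta_ahead_of_wal(before), cmeta_ahead_of_wal(after))
```

In this model the replica is in the inconsistent state from step 4 whenever the check returns True.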
>
> h3. 5. Stop a tserver to trigger fault recovery
> !image-2021-10-06-15-38-29-176.png!
> When the leader recovers a replica and the master requests a Raft config
> change to add the new replica, the request is ignored by the leader replica
> because its op index is smaller than the opid_index recorded in cmeta.
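A minimal sketch of why such a config change could be dropped, assuming the leader treats any change whose op index is not greater than the committed config's opid_index as stale. This is an assumption made for illustration, not a description of Kudu's implementation; `ToyLeader`, `propose_add_peer`, and the peer names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyLeader:
    """Toy leader replica (hypothetical, not a Kudu class)."""
    committed_config_opid_index: int  # value persisted in cmeta
    next_wal_index: int               # index the next WAL entry would receive
    peers: tuple = ()

    def propose_add_peer(self, peer: str) -> bool:
        """Try to add a peer; return False if the change is ignored as stale."""
        op_index = self.next_wal_index
        self.next_wal_index += 1
        # Assumed rule: a config change must be newer than the committed config.
        if op_index <= self.committed_config_opid_index:
            return False
        self.peers = self.peers + (peer,)
        self.committed_config_opid_index = op_index
        return True


# WAL was deleted: indexes restart near 1 while cmeta still records 20.
broken = ToyLeader(committed_config_opid_index=20, next_wal_index=2,
                   peers=("ts-a", "ts-b"))
broken_result = broken.propose_add_peer("ts-c")

# Intact WAL: the next index continues past the value recorded in cmeta.
healthy = ToyLeader(committed_config_opid_index=20, next_wal_index=21,
                    peers=("ts-a", "ts-b"))
healthy_result = healthy.propose_add_peer("ts-c")

print(broken_result, healthy_result)
```

Under this assumption the broken leader keeps ignoring config changes until its WAL index passes 20, which matches the symptom reported in step 5.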
>
> h3. 6. Delete all WALs
> !image-2021-10-06-15-38-39-852.png!
> h3. 7. Run kudu cluster rebalance
> ./kudu cluster rebalance $MASTER_IP
> !image-2021-10-06-15-38-53-343.png!
> !image-2021-10-06-15-39-03-296.png!
> The rebalance also fails when it changes the Raft config.