[jira] [Commented] (ZOOKEEPER-4878) Zookeeper servers not running after Chaos mesh IO fault experiment

Kezhu Wang (Jira) Fri, 18 Oct 2024 09:35:42 -0700


    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891018#comment-17891018
 ]


Kezhu Wang commented on ZOOKEEPER-4878:
---------------------------------------

Do you have more logs, so we can get more context about how you are testing 
and how it fails?

Personally, I think "unrecoverable" means it demands human intervention. If you 
are injecting faults so that IO operations behave as "corrupted", then 
"unrecoverable" is suitable.

> Zookeeper servers not running after Chaos mesh IO fault experiment
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4878
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4878
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.8.3
>            Reporter: Dharani
>            Priority: Major
>
> We are running ZooKeeper in Kubernetes as a StatefulSet with 3 replicas. When 
> we performed a Chaos Mesh IO fault experiment, the ZooKeeper servers did not 
> recover.
> "[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=[0:0:0:0:0:0:0:0]:2281):o.a.z.s.ZooKeeperServer@552]
>  - Severe unrecoverable error, exiting"
> java.io.FileNotFoundException: 
> /var/lib/zookeeper/data/version-2/snapshot.400000ed9 (Input/output error)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
