[ https://issues.apache.org/jira/browse/ZOOKEEPER-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891018#comment-17891018 ]
Kezhu Wang commented on ZOOKEEPER-4878:
---------------------------------------

Do you have more logs, so we can get more context about how you are testing and how it fails?

Personally, I think "unrecoverable" means it demands human intervention. If you are faulting IO operations so that they behave as "corrupted", then "unrecoverable" is suitable.

> Zookeeper servers not running after Chaos mesh IO fault experiment
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4878
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4878
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.8.3
>            Reporter: Dharani
>            Priority: Major
>
> We are running ZooKeeper in Kubernetes as a StatefulSet with 3 replicas. When
> we performed a Chaos Mesh IO fault experiment, the ZooKeeper servers did not
> recover.
> "[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=[0:0:0:0:0:0:0:0]:2281):o.a.z.s.ZooKeeperServer@552]
> - Severe unrecoverable error, exiting"
> java.io.FileNotFoundException:
> /var/lib/zookeeper/data/version-2/snapshot.400000ed9 (Input/output error)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)