lujingyu created ZOOKEEPER-4981:
-----------------------------------

             Summary: One node has neither crashed nor restarted again, but its information is inconsistent with other nodes
                 Key: ZOOKEEPER-4981
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4981
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.6.3
            Reporter: lujingyu
         Attachments: C1ZK1-1.log, C1ZK2.log, C1ZK3.log, C1ZK4.log, fuzzlog.txt, zookeeper--server-C1ZK1.txt, zookeeper--server-C1ZK2.txt, zookeeper--server-C1ZK3.txt, zookeeper--server-C1ZK4.txt, zookeeper--server-C1ZK5.txt
The timeline of the triggered bug:
# A cluster with five nodes: zk1, zk2, zk3, zk4, zk5; zk3 is the leader.
# client1 creates a znode "/bug";
# client1 creates a znode "/delete";
# then zk5 crashes (a follower, but likely the server client1 is connected to, because the fuzzlog file shows that client1 seems to have been briefly disconnected from the cluster before it sets znode "/bug" to "nice");
# then client1 reconnects to the cluster and sets znode "/bug"'s value to "nice";
# client1 reads znode "/bug" and gets the value "nice";
# client1 deletes znode "/delete";
# client1 creates an ephemeral znode "/eph";
# after that, zk4 crashes (important: it is the server client1 is connected to);
# then client2 connects to the cluster, and it can read the "/eph" node, which should be deleted after client1 leaves.

*********************************************************************************

The actual testing scenario is as follows. The cluster has five nodes: C1ZK1 (172.30.0.2), C1ZK2 (172.30.0.3), C1ZK3 (172.30.0.4), C1ZK4 (172.30.0.5), C1ZK5 (172.30.0.6).
# 2025-09-05 19:19:22,788 [ZK1Cli] - INFO - build connection with zookeeper (Client1 builds a connection with C1ZK4 [observed from the log])
# 2025-09-05 19:19:23,025 [ZK1Cli] - INFO - created znode /bug hello
# 2025-09-05 19:19:23,036 [ZK1Cli] - INFO - created znode /delete hello
# 2025-09-05 19:19:22,887 prepare to crash node C1ZK5 before the I/O operation performed by C1ZK1 (C1ZK1 is about to create the file "/home/zk1/evaluation/zk-3.6.3/zkData/version-2/log.100000001")
{code:java}
java.io.FileOutputStream.<init>(FileOutputStream.java:213)
java.io.FileOutputStream.<init>(FileOutputStream.java:162)
org.apache.zookeeper.server.persistence.FileTxnLog.append$$PHOSPHORTAGGED(FileTxnLog.java:287)
org.apache.zookeeper.server.persistence.FileTxnSnapLog.append$$PHOSPHORTAGGED(FileTxnSnapLog.java:582)
org.apache.zookeeper.server.ZKDatabase.append$$PHOSPHORTAGGED(ZKDatabase.java:641)
org.apache.zookeeper.server.SyncRequestProcessor.run$$PHOSPHORTAGGED(SyncRequestProcessor.java:181)
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:0)
{code}
# 2025-09-05 19:19:23,724 [ZK1Cli] - INFO - set znode /bug nice
# 2025-09-05 19:19:23,738 [ZK1Cli] - INFO - read znode /bug:nice
# 2025-09-05 19:19:23,758 [ZK1Cli] - INFO - deleted znode /delete
# 2025-09-05 19:19:23,770 [ZK1Cli] - INFO - created ephemeral znode /eph ephem
# Client1 successfully creates "/eph".
# 2025-09-05 19:19:26,278 prepare to crash node C1ZK4
{code:java}
org.apache.zookeeper.server.quorum.QuorumPacket.serialize$$PHOSPHORTAGGED(QuorumPacket.java:68)
org.apache.jute.BinaryOutputArchive.writeRecord$$PHOSPHORTAGGED(BinaryOutputArchive.java:126)
org.apache.zookeeper.server.quorum.Learner.writePacketNow$$PHOSPHORTAGGED(Learner.java:194)
org.apache.zookeeper.server.quorum.Learner.writePacket$$PHOSPHORTAGGED(Learner.java:186)
org.apache.zookeeper.server.quorum.Learner.request$$PHOSPHORTAGGED(Learner.java:254)
org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run$$PHOSPHORTAGGED(FollowerRequestProcessor.java:104)
org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java)
{code}
# 2025-09-05 19:19:27,124 ZK1Cli2 connects to the cluster and then incorrectly reads the ephemeral znode /eph created by cli1 (a minimal client-side sketch of this operation sequence follows below).
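For reference, here is a minimal client-side sketch (not from the attached test harness) of the operation sequence above, written against the plain ZooKeeper Java API. The connect string (the five 172.30.0.x servers on client port 11181), the class name Zk4981Repro, and the connection-latch helper are assumptions for illustration only; the crashes of C1ZK5 and C1ZK4 are injected externally by the fault-injection tool at the stack locations shown above, so they appear here only as comments.

{code:java}
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class Zk4981Repro {

    // Assumed connect string built from the scenario's quorum addresses and client port.
    private static final String CONNECT =
            "172.30.0.2:11181,172.30.0.3:11181,172.30.0.4:11181,172.30.0.5:11181,172.30.0.6:11181";

    // Block until the client session reaches SyncConnected.
    private static ZooKeeper connect() throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper(CONNECT, 15000, event -> {
            if (event.getState() == KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();
        return zk;
    }

    public static void main(String[] args) throws Exception {
        // client1 (ZK1Cli): the operation sequence from the timeline above.
        ZooKeeper client1 = connect();
        client1.create("/bug", "hello".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        client1.create("/delete", "hello".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        // <-- C1ZK5 is crashed here by the external harness (FileTxnLog.append stack above);
        //     the client library transparently reconnects to another server.
        client1.setData("/bug", "nice".getBytes(), -1);
        System.out.println("read /bug: " + new String(client1.getData("/bug", false, null)));
        client1.delete("/delete", -1);
        client1.create("/eph", "ephem".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        // <-- C1ZK4 (the server client1 is connected to) is crashed here by the harness
        //     (FollowerRequestProcessor stack above); client1 is not closed explicitly,
        //     so its session should simply expire later and /eph should be removed everywhere.

        // client2 (ZK1Cli2): connects afterwards and checks whether /eph is visible.
        ZooKeeper client2 = connect();
        System.out.println("/eph visible to client2: " + (client2.exists("/eph", false) != null));

        client2.close();
        client1.close();
    }
}
{code}

The sketch only reproduces the client workload; hitting the reported inconsistency still depends on the harness crashing C1ZK5 and C1ZK4 at exactly the points shown in the stack traces, and on comparing the znode lists of all servers afterwards, as the ZKChecker output below does.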
# 2025-09-05 19:19:39,760 [ZKChecker] - INFO - server C1ZK3:11181 and server C1ZK1:11181 have different numbers of znodes: [/zookeeper/quota, /bug, /zookeeper] vs. [/zookeeper/quota, /bug, /eph, /zookeeper]

**************************************************************************

The following information comes from the logs of each server. The transaction 0x100000001 "9/5/25 7:19:22 PM CST session 0x5005888b8400000 cxid 0x0 zxid 0x100000001 createSession 15000" indicates that the server created a session with the client. (C1ZK3 is the leader, so it has no such log.) From the timestamps below, C1ZK5 is the earliest server to connect to client1.

C1ZK4: 2025-09-05 19:19:22,932 [myid:4] - WARN [QuorumPeer[myid=4](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1
C1ZK1: 2025-09-05 19:19:22,889 [myid:1] - WARN [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1
C1ZK2: 2025-09-05 19:19:22,889 [myid:2] - WARN [QuorumPeer[myid=2](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1
C1ZK5: 2025-09-05 19:19:22,884 [myid:5] - WARN [QuorumPeer[myid=5](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1

So it is really confusing why only C1ZK1's node list differs from the other servers'. C1ZK1 is neither the node that crashed nor a node that was rebooted before Client2 connected to the cluster.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)