lujingyu created ZOOKEEPER-4981:
-----------------------------------

             Summary: One node has neither crashed nor restarted, but its 
information is inconsistent with other nodes
                 Key: ZOOKEEPER-4981
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4981
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.6.3
            Reporter: lujingyu
         Attachments: C1ZK1-1.log, C1ZK2.log, C1ZK3.log, C1ZK4.log, 
fuzzlog.txt, zookeeper--server-C1ZK1.txt, zookeeper--server-C1ZK2.txt, 
zookeeper--server-C1ZK3.txt, zookeeper--server-C1ZK4.txt, 
zookeeper--server-C1ZK5.txt

The timeline of the triggered bug (a client-side sketch of these operations follows the list):
 # A cluster with five nodes: zk1, zk2, zk3, zk4, zk5; zk3 is the leader.
 # client1 creates a znode "/bug";
 # client1 creates a znode "/delete";
 # then zk5 crashes (a follower, but likely the server client1 was connected to, because the fuzzlog file shows that client1 was briefly disconnected from the cluster before it set znode "/bug" to "nice");
 # then client1 reconnects to the cluster and sets znode "/bug"'s value to "nice";
 # client1 reads znode "/bug" and gets the value "nice";
 # client1 deletes znode "/delete";
 # client1 creates an ephemeral znode "/eph";
 # after that, zk4 crashes (important: it is the server client1 is connected to);
 # then client2 connects to the cluster and can still read the "/eph" node, which should have been deleted after client1 left.
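
For readability, here is a minimal sketch of what client1's operation sequence could look like with the standard ZooKeeper Java client, assuming the connect string (client port 11181), the 15000 ms session timeout, and the znode values given in this report; the class name and structure are illustrative only.
{code:java}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class Client1Scenario {
    public static void main(String[] args) throws Exception {
        // Connect string and session timeout are assumed from the reported cluster.
        ZooKeeper zk = new ZooKeeper(
                "172.30.0.2:11181,172.30.0.3:11181,172.30.0.4:11181,"
                        + "172.30.0.5:11181,172.30.0.6:11181",
                15000, event -> { });

        zk.create("/bug", "hello".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/delete", "hello".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // zk5 crashes around here; the client library reconnects to another server.

        zk.setData("/bug", "nice".getBytes(), -1);       // -1 = match any version
        byte[] value = zk.getData("/bug", false, null);  // expected to return "nice"
        System.out.println("read /bug = " + new String(value));
        zk.delete("/delete", -1);
        zk.create("/eph", "ephem".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // zk4 (the server client1 is connected to) crashes after this point; once
        // client1's session expires, /eph should be removed on every server.
    }
}
{code}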

*********************************************************************************

The actual testing scenario is as follows:

The cluster has five nodes: C1ZK1(172.30.0.2), C1ZK2(172.30.0.3), 
C1ZK3(172.30.0.4), C1ZK4(172.30.0.5), C1ZK5(172.30.0.6)
 # 2025-09-05 19:19:22,788 [ZK1Cli] - INFO -  build connection with zookeeper (Client1 builds a connection with C1ZK4, as observed from the log)
 # 2025-09-05 19:19:23,025 [ZK1Cli] - INFO - created znode /bug hello
 # 2025-09-05 19:19:23,036 [ZK1Cli] - INFO - created znode /delete hello
 # 2025-09-05 19:19:22,887 prepare to crash node C1ZK5 before the IO operation performed by C1ZK1 (C1ZK1 is about to create the file "/home/zk1/evaluation/zk-3.6.3/zkData/version-2/log.100000001"):
{code:java}
java.io.FileOutputStream.<init>(FileOutputStream.java:213)
java.io.FileOutputStream.<init>(FileOutputStream.java:162)
org.apache.zookeeper.server.persistence.FileTxnLog.append$$PHOSPHORTAGGED(FileTxnLog.java:287)
org.apache.zookeeper.server.persistence.FileTxnSnapLog.append$$PHOSPHORTAGGED(FileTxnSnapLog.java:582)
org.apache.zookeeper.server.ZKDatabase.append$$PHOSPHORTAGGED(ZKDatabase.java:641)
org.apache.zookeeper.server.SyncRequestProcessor.run$$PHOSPHORTAGGED(SyncRequestProcessor.java:181)
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:0)
{code}

 # 2025-09-05 19:19:23,724 [ZK1Cli] - INFO - set znode /bug nice
 # 2025-09-05 19:19:23,738 [ZK1Cli] - INFO - read znode /bug:nice
 # 2025-09-05 19:19:23,758 [ZK1Cli] - INFO - deleted znode /delete
 # 2025-09-05 19:19:23,770 [ZK1Cli] - INFO - created ephemeral znode /eph ephem
 # Client1 successfully creates "/eph".
 # 2025-09-05 19:19:26,278 prepare to crash node C1ZK4 
{code:java}
org.apache.zookeeper.server.quorum.QuorumPacket.serialize$$PHOSPHORTAGGED(QuorumPacket.java:68)
org.apache.jute.BinaryOutputArchive.writeRecord$$PHOSPHORTAGGED(BinaryOutputArchive.java:126)
org.apache.zookeeper.server.quorum.Learner.writePacketNow$$PHOSPHORTAGGED(Learner.java:194)
org.apache.zookeeper.server.quorum.Learner.writePacket$$PHOSPHORTAGGED(Learner.java:186)
org.apache.zookeeper.server.quorum.Learner.request$$PHOSPHORTAGGED(Learner.java:254)
org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run$$PHOSPHORTAGGED(FollowerRequestProcessor.java:104)
org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java)
{code}

 # 2025-09-05 19:19:27,124 ZK1Cli2 connects to the cluster and then incorrectly reads the ephemeral znode /eph created by client1.
 # 2025-09-05 19:19:39,760 [ZKChecker] - INFO - server C1ZK3:11181 and server 
C1ZK1:11181 have different number of znodes:[/zookeeper/quota, /bug, 
/zookeeper]  [/zookeeper/quota, /bug, /eph, /zookeeper] 
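
The ZKChecker in the entry above is an external test tool, not part of ZooKeeper itself, so the following is only a rough sketch, assuming the server addresses from this report, of how such a cross-server comparison can be done: connect to each server individually (so each read is answered from that server's local data tree) and compare the children of the root znode.
{code:java}
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeListChecker {
    // Addresses taken from the report; the client port 11181 is assumed.
    static final String[] SERVERS = {
            "172.30.0.2:11181", "172.30.0.3:11181", "172.30.0.4:11181",
            "172.30.0.5:11181", "172.30.0.6:11181"};

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> rootChildren = new TreeMap<>();
        for (String server : SERVERS) {
            // One session per server, so the read is served by that server only.
            ZooKeeper zk = new ZooKeeper(server, 15000, event -> { });
            try {
                rootChildren.put(server, zk.getChildren("/", false));
            } finally {
                zk.close();
            }
        }
        // Print each server's root children; any divergence indicates inconsistency.
        rootChildren.forEach((server, children) ->
                System.out.println(server + " -> " + children));
    }
}
{code}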

**************************************************************************

The following excerpts are from each server's log. The transaction 0x100000001, "9/5/25 7:19:22 PM CST session 0x5005888b8400000 cxid 0x0 zxid 0x100000001 createSession 15000", indicates that the server recorded the session created with the client. (C1ZK3 is the leader, so it has no such log entry.) According to these entries, C1ZK5 is the earliest server to record client1's connection.

C1ZK4: 2025-09-05 19:19:22,932 [myid:4] - WARN  [QuorumPeer[myid=4](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1

C1ZK1: 2025-09-05 19:19:22,889 [myid:1] - WARN  [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1

C1ZK2: 2025-09-05 19:19:22,889 [myid:2] - WARN  [QuorumPeer[myid=2](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1

C1ZK5: 2025-09-05 19:19:22,884 [myid:5] - WARN  [QuorumPeer[myid=5](plain=[0:0:0:0:0:0:0:0]:11181)(secure=disabled):Follower@170] - Got zxid 0x100000001 expected 0x1
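
For reference when reading these warnings: a zxid packs the leader epoch in its high 32 bits and a per-epoch counter in its low 32 bits, so 0x100000001 is the first transaction of epoch 1 (the createSession above), while the follower code logging the warning still expected 0x1. A minimal decoding sketch:
{code:java}
public class ZxidDecode {
    public static void main(String[] args) {
        long zxid = 0x100000001L;
        long epoch = zxid >>> 32;            // high 32 bits: leader epoch
        long counter = zxid & 0xffffffffL;   // low 32 bits: per-epoch transaction counter
        System.out.printf("epoch=%d counter=%d%n", epoch, counter); // prints: epoch=1 counter=1
    }
}
{code}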

 

So it is really confusing why only C1ZK1's znode list differs from the other servers'. C1ZK1 is neither the node that crashed nor a node that was rebooted before Client2 connected to the cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
