[ https://issues.apache.org/jira/browse/KAFKA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768759#comment-15768759 ]

Jun Rao commented on KAFKA-4277:
--------------------------------

[~fpj], in KAFKA-1387, you had the following comment:

"If a client has received a session expiration event, it means that the leader has expired the session and has broadcast the closeSession event to the followers. If the same client creates a new session successfully, then the server it connects to must have applied the previous closeSession, which deletes the ephemeral znodes, because ZK guarantees that txns are totally ordered. Consequently, the client shouldn't observe an ephemeral from an old session of its own. Note that another client could still observe the ephemeral znode after the session expiration if it is connected to a server that is a bit behind, but that's fine."

However, a while back in ZK's mailing list, you also had the following comment (http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results).

"Unless we expire a session and delete ephemerals atomically, there are only two options I see:
1- Delete right before expiring the session
2- Delete right after expiring the session
Because of timing, we can have the following. With the first, a client might observe the delete before the session actually expires, which violates our contract. With the second, you may observe an ephemeral znode after the session has expired as you have. I would say that the second option is correct as long as the ephemerals are eventually deleted, but it does have the side-effect you're mentioning."

Could you clarify whether ZK server guarantees to delete ephemeral nodes before notifying a client about session expiration?

> creating ephemeral node already exist
> -------------------------------------
>
>                 Key: KAFKA-4277
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4277
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.0.0
>            Reporter: Feixiang Yan
>
> I use zookeeper 3.4.6.
> The ZooKeeper session timed out and zkClient failed to reconnect. It then
> re-established the session, and re-registering the broker info in ZK threw
> a NODEEXISTS exception.
> I think this is because the ephemeral node created by the old session had
> not yet been removed.
> I read the
> [ZkUtils.scala|https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/utils/ZkUtils.scala]
> of 0.8.1: createEphemeralPathExpectConflictHandleZKBug tries to create the
> node in a while loop until the create succeeds. This can solve the issue.
> But in
> [ZkUtils.scala|https://github.com/apache/kafka/blob/0.10.0.1/core/src/main/scala/kafka/utils/ZkUtils.scala]
> of 0.10.0.1 the function has been removed.
> {noformat}
> [2016-10-07 19:00:32,562] INFO Socket connection established to
> 10.191.155.238/10.191.155.238:21819, initiating session
> (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,563] INFO zookeeper state changed (Expired)
> (org.I0Itec.zkclient.ZkClient)
> [2016-10-07 19:00:32,564] INFO Unable to reconnect to ZooKeeper service,
> session 0x1576b11f9b201bd has expired, closing socket connection
> (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,564] INFO Initiating client connection,
> connectString=10.191.155.237:21819,10.191.155.238:21819,10.191.155.239:21819/cluster2
> sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@ae71be2
> (org.apache.zookeeper.ZooKeeper)
> [2016-10-07 19:00:32,566] INFO Opening socket connection to server
> 10.191.155.237/10.191.155.237:21819. Will not attempt to authenticate using
> SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,566] INFO Socket connection established to
> 10.191.155.237/10.191.155.237:21819, initiating session
> (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,566] INFO EventThread shut down
> (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,567] INFO Session establishment complete on server
> 10.191.155.237/10.191.155.237:21819, sessionid = 0x1579ecd39c20006,
> negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2016-10-07 19:00:32,567] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> [2016-10-07 19:00:32,608] INFO re-registering broker info in ZK for broker 3
> (kafka.server.KafkaHealthcheck$SessionExpireListener)
> [2016-10-07 19:00:32,610] INFO Creating /brokers/ids/3 (is it secure? false)
> (kafka.utils.ZKCheckedEphemeral)
> [2016-10-07 19:00:32,611] INFO Result of znode creation is: NODEEXISTS
> (kafka.utils.ZKCheckedEphemeral)
> [2016-10-07 19:00:32,614] ERROR Error handling event ZkEvent[New session
> event sent to kafka.server.KafkaHealthcheck$SessionExpireListener@324f1bc]
> (org.I0Itec.zkclient.ZkEventThread)
> java.lang.RuntimeException: A broker is already registered on the path
> /brokers/ids/3. This probably indicates that you either have configured a
> brokerid that is already in use, or else you have shutdown this broker and
> restarted it faster than the zookeeper timeout so it appears to be
> re-registering.
>     at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:305)
>     at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:291)
>     at kafka.server.KafkaHealthcheck.register(KafkaHealthcheck.scala:70)
>     at kafka.server.KafkaHealthcheck$SessionExpireListener.handleNewSession(KafkaHealthcheck.scala:104)
>     at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:735)
>     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
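
For illustration, the retry approach described in the issue (keep trying to create the ephemeral node until the stale one left by the expired session is gone) could be sketched roughly as below. This is a minimal, self-contained Java sketch and not Kafka's or ZooKeeper's actual code: EphemeralStore, FakeStore, putStale and createWithRetry are hypothetical names introduced here, and the in-memory store only simulates a stale node that disappears after a few failed creates.

```java
import java.util.HashMap;
import java.util.Map;

class EphemeralRetrySketch {

    /** Hypothetical stand-in for a ZooKeeper client; NOT the real ZooKeeper API. */
    interface EphemeralStore {
        /** Returns true if the node was created, false if the path already exists. */
        boolean createEphemeral(String path, String data);

        /** Returns the data stored at the path, or null if the path does not exist. */
        String read(String path);
    }

    /** In-memory fake: a stale node is removed after a number of failed creates. */
    static final class FakeStore implements EphemeralStore {
        private final Map<String, String> nodes = new HashMap<>();
        private final Map<String, Integer> staleLifetimes = new HashMap<>();

        /** Simulates a node left behind by an expired session. */
        void putStale(String path, String data, int failedCreatesBeforeDeletion) {
            nodes.put(path, data);
            staleLifetimes.put(path, failedCreatesBeforeDeletion);
        }

        @Override
        public boolean createEphemeral(String path, String data) {
            if (!nodes.containsKey(path)) {
                nodes.put(path, data);
                return true;
            }
            Integer left = staleLifetimes.get(path);
            if (left != null) {
                if (left <= 1) {
                    // The simulated server finally deletes the stale node.
                    nodes.remove(path);
                    staleLifetimes.remove(path);
                } else {
                    staleLifetimes.put(path, left - 1);
                }
            }
            return false;
        }

        @Override
        public String read(String path) {
            return nodes.get(path);
        }
    }

    /**
     * Tries to create the node; if it already exists with our own data, assumes it is
     * a leftover from our previous session, backs off, and retries.
     * Returns the number of create attempts that were needed.
     */
    static int createWithRetry(EphemeralStore store, String path, String data,
                               int maxAttempts, long backoffMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (store.createEphemeral(path, data)) {
                return attempt;
            }
            String existing = store.read(path);
            if (existing == null) {
                continue; // node vanished between create and read; retry immediately
            }
            if (!existing.equals(data)) {
                throw new IllegalStateException("Path " + path + " is held by another registrant");
            }
            try {
                Thread.sleep(backoffMs); // wait for the stale node to be deleted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", e);
            }
        }
        throw new IllegalStateException("Gave up creating " + path + " after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.putStale("/brokers/ids/3", "broker-3", 2);
        int attempts = createWithRetry(store, "/brokers/ids/3", "broker-3", 5, 10L);
        System.out.println("registered after " + attempts + " attempt(s)");
    }
}
```

Comparing the stored data is only a heuristic for "this node is mine". With the real ZooKeeper client, one option would be to compare the node's Stat.getEphemeralOwner() against the current session id instead; whether a retry is needed at all depends on the answer to the ordering question asked above.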