[ https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151663#comment-14151663 ]
James Lent commented on KAFKA-1387: ----------------------------------- As background we are using ZooKeeper 3.4.5. When trying to come up with a fix for this I did consider limiting the loop to 2 to 3 tries. My concerns with this approach were: # Slow to recover if there are lots of Expire messages tp process and each of these could trigger redundant rebalance events until you get to the last one. # What happens if you don't loop quite long enough? You are again stuck in a bad state when the ephemeral does go away. I also considered trying to access the Session Id and storing that value instead of (or in addition to) the timestamp in the node's data. That appraoch looked difficult to implement, error prone, and had the application doing what I would consider ZooKeeper work. I agree there are a lot of corner cases to consider, but, I think we are going to pursue the approach I outlined above. I would be happy to post the proposed solution for your review, but, again I am not sure about the protocol around patch submission. I would not want this to be mistaken by someone as any kind of offical patch without a lot more review. When working on this appraoch I looked at the curator PersistentEphemeralNode for ideas: https://github.com/bazaarvoice/curator-extensions/blob/master/recipes/src/main/java/com/bazaarvoice/curator/recipes/PersistentEphemeralNode.java This is curator based so done not directly apply to Kafka (yet), but, it also keys off nodeDelete to restore the node. In the end I went with the simple idea that: "If when we process an Expire event the node still exists then ZooKeeper will inform us if that node later goes away." If we can't trust ZooKeeper/ZkClient to do that then ... {noformat} class StateListener() extends IZkStateListener { def handleStateChanged(state: KeeperState) {} def handleNewSession() { if (zkClient.exists(path)) { info("New session started, but, ephemeral %s already/still exists".format(path)) } else { info("New session started, recreate ephemeral node %s".format(path)) recreateNode() } } } {noformat} > Kafka getting stuck creating ephemeral node it has already created when two > zookeeper sessions are established in a very short period of time > --------------------------------------------------------------------------------------------------------------------------------------------- > > Key: KAFKA-1387 > URL: https://issues.apache.org/jira/browse/KAFKA-1387 > Project: Kafka > Issue Type: Bug > Reporter: Fedor Korotkiy > > Kafka broker re-registers itself in zookeeper every time handleNewSession() > callback is invoked. > https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala > > Now imagine the following sequence of events. > 1) Zookeeper session reestablishes. handleNewSession() callback is queued by > the zkClient, but not invoked yet. > 2) Zookeeper session reestablishes again, queueing callback second time. > 3) First callback is invoked, creating /broker/[id] ephemeral path. > 4) Second callback is invoked and it tries to create /broker/[id] path using > createEphemeralPathExpectConflictHandleZKBug() function. But the path is > already exists, so createEphemeralPathExpectConflictHandleZKBug() is getting > stuck in the infinite loop. > Seems like controller election code have the same issue. > I'am able to reproduce this issue on the 0.8.1 branch from github using the > following configs. > # zookeeper > tickTime=10 > dataDir=/tmp/zk/ > clientPort=2101 > maxClientCnxns=0 > # kafka > broker.id=1 > log.dir=/tmp/kafka > zookeeper.connect=localhost:2101 > zookeeper.connection.timeout.ms=100 > zookeeper.sessiontimeout.ms=100 > Just start kafka and zookeeper and then pause zookeeper several times using > Ctrl-Z. -- This message was sent by Atlassian JIRA (v6.3.4#6332)