Hello Libo,

Which Kafka version are you using? Before 0.8.1 there was a bug that could cause a broker's registration path to be deleted:
https://issues.apache.org/jira/browse/KAFKA-992

This has been fixed in 0.8.1.

Guozhang

On Tue, Feb 11, 2014 at 1:16 PM, Yu, Libo <libo...@citi.com> wrote:

> Hi team,
>
> This is an issue that has frustrated me for quite some time. One of our
> clusters has three hosts. In my startup script, three zookeeper processes
> are brought up first, followed by three kafka processes. The problem we
> have is that after the three kafka processes are up, only one broker has
> been registered in zookeeper (in this case, host three). If I manually
> kill the kafka processes on host one and host two and restart them, they
> can register themselves with zookeeper successfully. I've attached logs
> from host one. The log indicated broker 1 was registered at /brokers/ids.
> When I checked zookeeper, I found only broker 3 was registered. It seems
> there is a race condition.
>
> [2014-02-11 15:20:55,266] INFO Session establishment complete on server cfgtps1q-phys/HostOne:9181, sessionid = 0x144229beb980000, negotiated timeout = 10000 (org.apache.zookeeper.ClientCnxn)
> [2014-02-11 15:20:55,268] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> [2014-02-11 15:20:55,378] INFO /brokers/ids/1 exists with value { "host":"cfgtps1q-phys.nam.nsroot.net", "jmx_port":9999, "port":11934, "version":1 } during connection loss; this is ok (kafka.utils.ZkUtils$)
> [2014-02-11 15:20:55,379] INFO Registered broker 1 at path /brokers/ids/1 with address hostone.xxx.xxxxxx.net:11934. (kafka.utils.ZkUtils$)
> [2014-02-11 15:20:55,380] INFO [Kafka Server 1], Connecting to ZK: HostOne:9181, HostTwo:9181, HostThree:9181 (kafka.server.KafkaServer)
> [2014-02-11 15:20:55,511] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
> [2014-02-11 15:20:55,520] INFO conflict in /controller data: 1 stored data: 3 (kafka.utils.ZkUtils$)
> [2014-02-11 15:20:55,538] INFO [Kafka Server 1], Started (kafka.server.KafkaServer)
> [2014-02-11 15:20:58,015] INFO 1 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
> [2014-02-11 15:20:58,605] INFO Accepted socket connection from /HostThree:52420 (org.apache.zookeeper.server.NIOServerCnxn)
> [2014-02-11 15:20:58,609] INFO Client attempting to establish new session at /HostThree:52420 (org.apache.zookeeper.server.NIOServerCnxn)
> [2014-02-11 15:20:58,616] INFO Established session 0x144229beb980001 with negotiated timeout 10000 for client /HostThree:52420 (org.apache.zookeeper.server.NIOServerCnxn)
> [2014-02-11 15:21:01,064] INFO New leader is 1 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> [2014-02-11 15:21:36,375] INFO Accepted socket connection from /xx.xx.xxx.xx:54709 (org.apache.zookeeper.server.NIOServerCnxn)
> [2014-02-11 15:21:36,378] INFO Client attempting to establish new session at /xx.xx.xxx.xx:54709 (org.apache.zookeeper.server.NIOServerCnxn)
>
> Regards,
>
> Libo

--
-- Guozhang
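
To confirm after startup which brokers are actually registered, rather than relying on the broker's own "Registered broker" log line, one option is to list the children of /brokers/ids and compare them with the ids you expect. The following is only a rough sketch under stated assumptions: it assumes the third-party kazoo Python client is installed, and the function names missing_broker_ids and registered_broker_ids are hypothetical helpers invented for this illustration, not part of Kafka or ZooKeeper.

```python
from typing import Iterable, List

BROKER_IDS_PATH = "/brokers/ids"  # registration path named in the log output above


def missing_broker_ids(registered: Iterable[str], expected: Iterable[int]) -> List[int]:
    """Return the expected broker ids that have no child znode under /brokers/ids."""
    present = {int(name) for name in registered}
    return sorted(set(expected) - present)


def registered_broker_ids(hosts: str) -> List[str]:
    """List the children of /brokers/ids.

    Assumption: the kazoo package is available. It is imported lazily so that
    missing_broker_ids can be used without it.
    """
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        return zk.get_children(BROKER_IDS_PATH)
    finally:
        zk.stop()
        zk.close()


if __name__ == "__main__":
    # The situation described in the report: only broker 3 is registered.
    print(missing_broker_ids(["3"], [1, 2, 3]))  # prints [1, 2]
```

Against a live ensemble this would be called as missing_broker_ids(registered_broker_ids("HostOne:9181,HostTwo:9181,HostThree:9181"), [1, 2, 3]), using the connection string from the log; a non-empty result a few seconds after startup would match the symptom described.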