[ https://issues.apache.org/jira/browse/KAFKA-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823187#comment-15823187 ]
Maharajan Shunmuga Sundaram commented on KAFKA-4634: ---------------------------------------------------- I do see following issues in ez6 (Node id : 211) server log [2017-01-13 01:05:28,418] INFO Closing socket connection to /10.254.2.29. (kafka.network.Processor) [2017-01-13 01:05:28,507] INFO re-registering broker info in ZK for broker 211 (kafka.server.KafkaHealthcheck) [2017-01-13 01:05:28,537] INFO conflict in /brokers/ids/211 data: {"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092} stored data: {"jmx_port":-1,"timestamp":"1484287051977","host":"ez6","version":1,"port":9092} (kafka.utils.ZkUtils$) [2017-01-13 01:05:28,562] INFO I wrote this conflicted ephemeral node [{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}] at /brokers/ids/211 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) to [2017-01-13 01:35:26,146] INFO I wrote this conflicted ephemeral node [{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}] at /brokers/ids/211 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) [2017-01-13 01:35:32,151] INFO conflict in /brokers/ids/211 data: {"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092} stored data: {"jmx_port":-1,"timestamp":"1484287051977","host":"ez6","version":1,"port":9092} (kafka.utils.ZkUtils$) [2017-01-13 01:35:32,157] INFO I wrote this conflicted ephemeral node [{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}] at /brokers/ids/211 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) [2017-01-13 01:35:51,958] INFO [ReplicaFetcherManager on broker 211] Removed fetcher for partitions [Tracing,0] (kafka.server.ReplicaFetcherManager) [2017-01-13 01:35:52,049] INFO Closing socket connection to /10.254.2.19. (kafka.network.Processor) [2017-01-13 01:35:52,078] INFO Registered broker 211 at path /brokers/ids/211 with address ez6:9092. (kafka.utils.ZkUtils$) [2017-01-13 01:35:52,078] INFO done re-registering broker (kafka.server.KafkaHealthcheck) [2017-01-13 01:35:52,078] INFO Subscribing to /brokers/topics path to watch for new topics (kafka.server.KafkaHealthcheck) [2017-01-13 01:35:52,089] INFO New leader is 113 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener) Almost 30 secs, server tried to connect zookeeper resulted conflicts until it succeeds. I brought this in case it may hint any problem. > Issue of one kafka brokers not listed in zookeeper > -------------------------------------------------- > > Key: KAFKA-4634 > URL: https://issues.apache.org/jira/browse/KAFKA-4634 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 0.8.2.1 > Environment: Zookeeper version: 3.4.6-1569965, built on 02/20/2014 > 09:09 GMT > kafka_2.10-0.8.2.1 > Reporter: Maharajan Shunmuga Sundaram > > Hi, > We have incident that one of the 10 brokers not listed in brokers list of > zookeeper. > This is verified by running following command > >> echo dump | nc cz2 2181 > SessionTracker dump: > Session Sets (4): > 0 expire at Fri Jan 13 22:32:14 EST 2017: > 0 expire at Fri Jan 13 22:32:16 EST 2017: > 7 expire at Fri Jan 13 22:32:18 EST 2017: > 0x259968e41e30000 > 0x35996670d5d0001 > 0x35996670d5d0000 > 0x159966708470004 > 0x159966e47760000 > 0x159966708470003 > 0x2599672df260000 > 3 expire at Fri Jan 13 22:32:20 EST 2017: > 0x159968e41dd0000 > 0x259966708550001 > 0x259966708550000 > ephemeral nodes dump: > Sessions with Ephemerals (9): > 0x259966708550000: > /brokers/ids/112 > 0x259968e41e30000: > /brokers/ids/213 > 0x159968e41dd0000: > /brokers/ids/19 > 0x159966708470003: > /brokers/ids/110 > 0x35996670d5d0000: > /brokers/ids/113 > /controller > 0x259966708550001: > /brokers/ids/111 > 0x159966708470004: > /brokers/ids/212 > 0x2599672df260000: > /brokers/ids/29 > 0x35996670d5d0001: > /brokers/ids/210 > ------ > There are 10 sessions, but only 9 sessions are listed with brokers. > Broker with id 211 is not listed. Session 0x159966e47760000 is not shown with > broker id 211. > In the broker side log, I do see it is connected > >> zgrep "0x159966e47760000" *log* > > zk.log:[2017-01-13 01:05:28,513] INFO Session establishment complete on > server cz1/10.254.2.19:2181, sessionid = 0x159966e47760000, negotiated > timeout = 6000 (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:38,163] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:39,101] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:40,121] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:41,770] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:42,439] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:43,235] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:44,950] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:35:45,837] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > . > . > . > . > zk.log:[2017-01-13 01:40:14,818] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:15,916] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:19,692] INFO Client session timed out, have not > heard from server in 3676ms for sessionid 0x159966e47760000, closing socket > connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:20,632] INFO Unable to read additional data from > server sessionid 0x159966e47760000, likely server has closed socket, closing > socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:20,814] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:22,089] WARN Session 0x159966e47760000 for server > null, unexpected error, closing socket connection and attempting reconnect > (org.apache.zookeeper.ClientCnxn) > zk.log:[2017-01-13 01:40:22,562] INFO Session establishment complete on > server cz2/10.254.2.29:2181, sessionid = 0x159966e47760000, negotiated > timeout = 6000 (org.apache.zookeeper.ClientCnxn) > After several tries, broker connected with zookeeper cz2:2181. I am not sure > how to debug this issue. It would be helpful if someone provides way to debug > this issue. > Regards, > Maharajan S -- This message was sent by Atlassian JIRA (v6.3.4#6332)