Maharajan Shunmuga Sundaram created KAFKA-4634:
--------------------------------------------------

             Summary: Issue of one kafka brokers not listed in zookeeper
                 Key: KAFKA-4634
                 URL: https://issues.apache.org/jira/browse/KAFKA-4634
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8.2.1
            Reporter: Maharajan Shunmuga Sundaram


Hi,

We had an incident in which one of our 10 brokers was not listed in the brokers list in ZooKeeper. We verified this by running the following command:

>> echo dump | nc cz2 2181
SessionTracker dump:
Session Sets (4):
0 expire at Fri Jan 13 22:32:14 EST 2017:
0 expire at Fri Jan 13 22:32:16 EST 2017:
7 expire at Fri Jan 13 22:32:18 EST 2017:
    0x259968e41e30000
    0x35996670d5d0001
    0x35996670d5d0000
    0x159966708470004
    0x159966e47760000
    0x159966708470003
    0x2599672df260000
3 expire at Fri Jan 13 22:32:20 EST 2017:
    0x159968e41dd0000
    0x259966708550001
    0x259966708550000
ephemeral nodes dump:
Sessions with Ephemerals (9):
0x259966708550000:
    /brokers/ids/112
0x259968e41e30000:
    /brokers/ids/213
0x159968e41dd0000:
    /brokers/ids/19
0x159966708470003:
    /brokers/ids/110
0x35996670d5d0000:
    /brokers/ids/113
    /controller
0x259966708550001:
    /brokers/ids/111
0x159966708470004:
    /brokers/ids/212
0x2599672df260000:
    /brokers/ids/29
0x35996670d5d0001:
    /brokers/ids/210
------

There are 10 sessions, but only 9 of them are listed with a broker ephemeral node. The broker with id 211 is not listed. Session 0x159966e47760000 appears in the session list, but it is not shown with /brokers/ids/211.

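The comparison above (sessions known to ZooKeeper versus sessions that own an ephemeral node) can be automated. The following is only a sketch, assuming the `dump` output has the layout shown in this report. The function name `sessions_without_ephemerals` and the `SAMPLE_DUMP` constant are hypothetical names introduced for illustration and are not part of Kafka or ZooKeeper.

```python
import re

# Dump text copied from this report (output of: echo dump | nc cz2 2181).
SAMPLE_DUMP = """\
SessionTracker dump:
Session Sets (4):
0 expire at Fri Jan 13 22:32:14 EST 2017:
0 expire at Fri Jan 13 22:32:16 EST 2017:
7 expire at Fri Jan 13 22:32:18 EST 2017:
    0x259968e41e30000
    0x35996670d5d0001
    0x35996670d5d0000
    0x159966708470004
    0x159966e47760000
    0x159966708470003
    0x2599672df260000
3 expire at Fri Jan 13 22:32:20 EST 2017:
    0x159968e41dd0000
    0x259966708550001
    0x259966708550000
ephemeral nodes dump:
Sessions with Ephemerals (9):
0x259966708550000:
    /brokers/ids/112
0x259968e41e30000:
    /brokers/ids/213
0x159968e41dd0000:
    /brokers/ids/19
0x159966708470003:
    /brokers/ids/110
0x35996670d5d0000:
    /brokers/ids/113
    /controller
0x259966708550001:
    /brokers/ids/111
0x159966708470004:
    /brokers/ids/212
0x2599672df260000:
    /brokers/ids/29
0x35996670d5d0001:
    /brokers/ids/210
"""


def sessions_without_ephemerals(dump_text):
    """Return (sessions owning no ephemeral node, {session: [paths]})."""
    all_sessions = set()
    owners = {}
    current = None
    in_ephemerals = False
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.lower().startswith("ephemeral nodes dump"):
            in_ephemerals = True  # everything after this is the ephemeral section
            continue
        if not in_ephemerals:
            # Session tracker section: bare session ids, one per line.
            if re.fullmatch(r"0x[0-9a-fA-F]+", line):
                all_sessions.add(line)
        else:
            # Ephemeral section: "<session id>:" followed by znode paths.
            owner = re.fullmatch(r"(0x[0-9a-fA-F]+):", line)
            if owner:
                current = owner.group(1)
                owners.setdefault(current, [])
            elif line.startswith("/") and current is not None:
                owners[current].append(line)
    return sorted(all_sessions - set(owners)), owners


if __name__ == "__main__":
    missing, owners = sessions_without_ephemerals(SAMPLE_DUMP)
    print("sessions without ephemeral nodes:", missing)
```

Run against the dump from this report, the only session it flags is 0x159966e47760000, which matches the manual comparison. It does not explain why the node is missing; it only narrows down which session to search for in the broker logs.
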
In the broker-side log, I do see that it is connected:

>> zgrep "0x159966e47760000" *log*
zk.log:[2017-01-13 01:05:28,513] INFO Session establishment complete on server cz1/10.254.2.19:2181, sessionid = 0x159966e47760000, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:38,163] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:39,101] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:40,121] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:41,770] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:42,439] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:43,235] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:44,950] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:35:45,837] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
. . . .
zk.log:[2017-01-13 01:40:14,818] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:15,916] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:19,692] INFO Client session timed out, have not heard from server in 3676ms for sessionid 0x159966e47760000, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:20,632] INFO Unable to read additional data from server sessionid 0x159966e47760000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:20,814] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:22,089] WARN Session 0x159966e47760000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zk.log:[2017-01-13 01:40:22,562] INFO Session establishment complete on server cz2/10.254.2.29:2181, sessionid = 0x159966e47760000, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)

After several tries, the broker connected to ZooKeeper at cz2:2181.

I am not sure how to debug this issue. It would be helpful if someone could suggest a way to debug it.

Regards,
Maharajan S



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)