This is what Kubernetes tells me:

Name:              zookeeper
Namespace:         default
Labels:            <none>
Selector:          name=zookeeper
Type:              ClusterIP
IP:                10.0.0.184
Port:              zookeeper 2181/TCP
Endpoints:         172.17.0.4:2181
Session Affinity:  None

So the address is always 10.0.0.184. From the log I understand that the crash is related to the zookeeper pod I closed, so the kafka server lost its connection to it. From that point on, the log should show the attempts to connect to the new zookeeper, which is up and running with the same IP address as the previous one.

Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor
Twitter : @ppatierno
Linkedin : paolopatierno
Blog : DevExperience

> Date: Tue, 10 May 2016 17:49:59 +0200
> From: ra...@gruchalski.com
> To: users@kafka.apache.org
> Subject: Re: Zookeeper dies ... Kafka server unable to connect
>
> Are you sure you're getting the same IP address?
> Regarding the zookeeper connection being closed, is kubernetes doing a soft shutdown of your container? If so, zookeeper is asked politely to stop.
> --
> Best regards,
> Radek Gruchalski
> radek@gruchalski.com
> de.linkedin.com/in/radgruchalski
> +4917685656526
>
> On May 10, 2016 at 5:47:24 PM, Paolo Patierno (ppatie...@live.com) wrote:
>
> Hi all,
>
> experimenting with Kafka on Kubernetes, I get the following error when the Kafka server reconnects ...
>
> A cluster with one zookeeper and two kafka servers ...
> I turn off the zookeeper pod, but kubernetes restarts it and guarantees the same IP address for it; the kafka server then starts retrying the connection and fails with the following trace :
>
> [2016-05-10 15:40:55,046] WARN Session 0x1549b308dd20002 for server 10.0.0.184/10.0.0.184:2181, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> [2016-05-10 15:40:55,149] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
> [2016-05-10 15:40:57,093] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:40:57,093] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:40:57,158] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:40:58,936] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181.
> Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:40:58,936] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:40:58,937] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:00,845] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:00,845] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:00,846] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:02,071] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:02,071] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:02,072] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:03,336] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181.
> Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:03,336] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:03,337] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:05,121] INFO Opening socket connection to server 10.0.0.184/10.0.0.184:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:05,121] INFO Socket connection established to 10.0.0.184/10.0.0.184:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-05-10 15:41:05,122] INFO Unable to read additional data from server sessionid 0x1549b308dd20002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>
> You can see the moment the first zookeeper dies and the connection is lost ... and then all the retries by the kafka server to connect to the new one (same IP, same port).
>
> Why does the zookeeper server close the connection (I can see FIN ACK frames on Wireshark)?
>
> Thanks,
> Paolo.
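
One way to check, independently of Kafka, whether the restarted zookeeper behind 10.0.0.184:2181 is actually serving requests is to send it the `ruok` four-letter command and see whether it answers `imok`. The following is only a minimal sketch: the helper name `zk_ruok` and the timeout value are illustrative assumptions, and it assumes four-letter commands are enabled on the server (the default in ZooKeeper 3.4.x; later releases need them whitelisted via `4lw.commands.whitelist`).

```python
import socket


def zk_ruok(host: str, port: int = 2181, timeout: float = 2.0) -> bool:
    """Return True if a ZooKeeper server at host:port answers 'imok' to 'ruok'.

    Hypothetical helper for illustration; not part of Kafka or ZooKeeper.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # 'ruok' is one of ZooKeeper's four-letter admin commands.
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte answer or the server closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Covers refused connections, resets and timeouts.
        return False
```

Calling `zk_ruok("10.0.0.184")` from inside the kafka pod would exercise the same service address the broker uses. A `True` result only shows that the new server is up and answering on that address; it does not by itself explain why the existing session 0x1549b308dd20002 gets closed.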