[ https://issues.apache.org/jira/browse/KAFKA-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Kreps resolved KAFKA-462.
-----------------------------
    Resolution: Won't Fix

> ZK thread crashing doesn't bring down the broker (and doesn't come back up).
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-462
>                 URL: https://issues.apache.org/jira/browse/KAFKA-462
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.7
>            Reporter: Matt Jones
>
> I think the simplest explanation is the traceback. The broker had been up
> since 2012-07-31 18:45:42,951 (based on the 'Starting Kafka server' log
> entry), and the error was fixed by restarting the broker at
> 2012-08-14 20:59:41,581.
> It looks like the ZooKeeper thread crashed, but the broker kept operating as
> usual. The expected behavior would be that a crash of the ZooKeeper thread
> either brings down the whole broker or the ZooKeeper thread starts itself
> back up.
>
> [2012-08-08 01:25:13,398] 624270894 [main-SendThread(zookeeper001:2181)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 8749ms for sessionid 0x138e4edc04c1e50, closing socket connection and attempting reconnect
> [2012-08-08 01:25:15,136] 624272632 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (Disconnected)
> [2012-08-08 01:25:15,702] 624273198 [main-SendThread(zookeeper001:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server zookeeper003/10.125.95.193:2181
> [2012-08-08 01:25:15,704] 624273200 [main-SendThread(zookeeper003:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to zookeeper003/10.125.95.193:2181, initiating session
> [2012-08-08 01:25:15,709] 624273205 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (Expired)
> [2012-08-08 01:25:15,709] 624273205 [main-EventThread] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@26d66426
> [2012-08-08 01:25:21,514] 624279010 [main-SendThread(zookeeper003:2181)] INFO org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x138e4edc04c1e50 has expired, closing socket connection
> [2012-08-08 01:25:47,135] 624304631 [main-EventThread] ERROR org.apache.zookeeper.ClientCnxn - Error while calling watcher
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> Caused by: org.I0Itec.zkclient.exception.ZkException: Unable to connect to zookeeper001:2181,zookeeper002:2181,zookeeper003:2181
> Caused by: java.net.UnknownHostException: zookeeper001
>     at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:386)
>     at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:331)
>     at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:377)
> [2012-08-08 01:25:48,620] 624306116 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down
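
For illustration only, the two behaviors the reporter describes as expected (the ZooKeeper thread starts itself back up, or the failure takes the broker down) can be sketched as a bounded retry loop with a give-up hook. This is not how Kafka or ZkClient behaves (the issue was resolved as Won't Fix), and it calls no ZooKeeper API. The names `SessionRecovery`, `Connector`, and `reconnectOrGiveUp` are hypothetical, and the connect step is simulated with the same `UnknownHostException` that appears in the log.

```java
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper; not part of Kafka, ZkClient, or ZooKeeper. */
final class SessionRecovery {
    private SessionRecovery() {}

    /** Stand-in for "create a new ZooKeeper session"; may throw on failure. */
    interface Connector {
        void connect() throws Exception;
    }

    /**
     * Tries to connect up to maxAttempts times, sleeping backoffMs between
     * attempts. Returns true as soon as one attempt succeeds. If every
     * attempt fails (or the thread is interrupted while waiting), runs
     * onGiveUp and returns false.
     */
    static boolean reconnectOrGiveUp(Connector connector, int maxAttempts,
                                     long backoffMs, Runnable onGiveUp) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connector.connect();
                return true;
            } catch (Exception e) {
                System.err.println("reconnect attempt " + attempt + " failed: " + e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying and fall through to onGiveUp
                }
            }
        }
        onGiveUp.run();
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated connect: fails twice with a DNS error, then succeeds.
        boolean ok = reconnectOrGiveUp(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new UnknownHostException("zookeeper001");
            }
        }, 5, 10L, () -> System.err.println("giving up; a broker could exit here"));
        System.out.println("reconnected=" + ok + " attempts=" + calls.get());
    }
}
```

In this sketch the give-up hook is where a process could choose to terminate instead of continuing without a session. Whether that is the right policy for a broker is a judgment call that this example does not settle.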