Yiquan Zhou created KAFKA-4391: ---------------------------------- Summary: On Windows, Kafka server stops with uncaught exception after coming back from sleep Key: KAFKA-4391 URL: https://issues.apache.org/jira/browse/KAFKA-4391 Project: Kafka Issue Type: Bug Affects Versions: 0.9.0.1 Environment: Windows 10, jdk1.8.0_111 Reporter: Yiquan Zhou
Steps to reproduce: 1. start the zookeeper $ bin/zookeeper-server-start.sh config/zookeeper.properties 2. start the Kafka server with the default properties $ bin/kafka-server-start.sh config/server.properties 3. put Windows into sleep mode for 1-2 hours 4. activate Windows again, an exception occurs in Kafka server console and the server is stopped: {code:title=kafka console log} [2016-11-08 21:45:35,185] INFO Client session timed out, have not heard from server in 10081379ms for sessionid 0x1584514da470000, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:40,698] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient) [2016-11-08 21:45:43,029] INFO Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,044] INFO Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,158] INFO Unable to reconnect to ZooKeeper service, session 0x1584514da470000 has expired, closing socket connection (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,158] INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient) [2016-11-08 21:45:43,236] INFO Initiating client connection, connectString=localhost:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@11ca437b (org.apache.zookeeper.ZooKeeper) [2016-11-08 21:45:43,280] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn) log4j:ERROR Failed to rename [/controller.log] to [/controller.log.2016-11-08-18]. [2016-11-08 21:45:43,421] INFO Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,483] INFO Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,811] INFO Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x1584514da470001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn) [2016-11-08 21:45:43,827] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient) log4j:ERROR Failed to rename [/server.log] to [/server.log.2016-11-08-18]. [2016-11-08 21:45:43,827] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral) [2016-11-08 21:45:44,014] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral) [2016-11-08 21:45:44,014] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector) log4j:ERROR Failed to rename [/state-change.log] to [/state-change.log.2016-11-08-18]. [2016-11-08 21:45:44,421] INFO re-registering broker info in ZK for broker 0 (kafka.server.KafkaHealthcheck) [2016-11-08 21:45:44,436] INFO Creating /brokers/ids/0 (is it secure? false) (kafka.utils.ZKCheckedEphemeral) [2016-11-08 21:45:44,686] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral) [2016-11-08 21:45:44,686] INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT -> EndPoint(192.168.0.15,9092,PLAINTEXT) (kafka.utils.ZkUtils) [2016-11-08 21:45:44,686] INFO done re-registering broker (kafka.server.KafkaHealthcheck) [2016-11-08 21:45:44,686] INFO Subscribing to /brokers/topics path to watch for new topics (kafka.server.KafkaHealthcheck) [2016-11-08 21:45:45,046] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions [test,0] (kafka.server.ReplicaFetcherManager) [2016-11-08 21:45:45,061] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener) [2016-11-08 21:45:47,325] ERROR Uncaught exception in scheduled task 'kafka-recovery-point-checkpoint' (kafka.utils.KafkaScheduler) java.io.IOException: File rename from D:\tmp\kafka-logs\recovery-point-offset-checkpoint.tmp to D:\tmp\kafka-logs\recovery-point-offset-checkpoint failed. at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:66) at kafka.log.LogManager.kafka$log$LogManager$$checkpointLogsInDir(LogManager.scala:326) at kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317) at kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at kafka.log.LogManager.checkpointRecoveryPointOffsets(LogManager.scala:317) at kafka.log.LogManager$$anonfun$startup$3.apply$mcV$sp(LogManager.scala:201) at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} {code:title=zookeeper console log} [2016-11-08 21:45:35,185] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn) EndOfStreamException: Unable to read additional data from client sessionid 0x1584514da470000, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:744) [2016-11-08 21:45:35,185] INFO Closed socket connection for client /0:0:0:0:0:0:0:1:50192 which had sessionid 0x1584514da470000 (org.apache.zookeeper.server.NIOServerCnxn) [2016-11-08 21:45:35,372] INFO Expiring session 0x1584514da470000, timeout of 6000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer) [2016-11-08 21:45:35,372] INFO Processed session termination for sessionid: 0x1584514da470000 (org.apache.zookeeper.server.PrepRequestProcessor) [2016-11-08 21:45:40,588] WARN fsync-ing the write ahead log in SyncThread:0 took 5221ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide (org.apache.zookeeper.server.persistence.FileTxnLog) [2016-11-08 21:45:43,044] INFO Accepted socket connection from /127.0.0.1:50783 (org.apache.zookeeper.server.NIOServerCnxnFactory) [2016-11-08 21:45:43,127] INFO Client attempting to renew session 0x1584514da470000 at /127.0.0.1:50783 (org.apache.zookeeper.server.ZooKeeperServer) [2016-11-08 21:45:43,158] INFO Invalid session 0x1584514da470000 for client /127.0.0.1:50783, probably expired (org.apache.zookeeper.server.ZooKeeperServer) [2016-11-08 21:45:43,158] INFO Closed socket connection for client /127.0.0.1:50783 which had sessionid 0x1584514da470000 (org.apache.zookeeper.server.NIOServerCnxn) [2016-11-08 21:45:43,483] INFO Accepted socket connection from /127.0.0.1:50801 (org.apache.zookeeper.server.NIOServerCnxnFactory) [2016-11-08 21:45:43,514] INFO Client attempting to establish new session at /127.0.0.1:50801 (org.apache.zookeeper.server.ZooKeeperServer) [2016-11-08 21:45:43,811] INFO Established session 0x1584514da470001 with negotiated timeout 6000 for client /127.0.0.1:50801 (org.apache.zookeeper.server.ZooKeeperServer) [2016-11-08 21:45:44,327] INFO Got user-level KeeperException when processing sessionid:0x1584514da470001 type:delete cxid:0x18 zxid:0x15d txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor) [2016-11-08 21:45:44,436] INFO Got user-level KeeperException when processing sessionid:0x1584514da470001 type:create cxid:0x19 zxid:0x15e txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists for /brokers (org.apache.zookeeper.server.PrepRequestProcessor) [2016-11-08 21:45:44,452] INFO Got user-level KeeperException when processing sessionid:0x1584514da470001 type:create cxid:0x1a zxid:0x15f txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids (org.apache.zookeeper.server.PrepRequestProcessor) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)