1. What does 'zookeeper state changed (Expired)' mean? 2. Has anyone seen issues like this before? Where zookeeper connections are flaky enough to cause leader elections?
It means zookeeper expired the session. The most common reason for this is client side GC (in your case, client is the Kafka broker) though if your zookeeper setup is flaky, this might even point to a server side I/O bottleneck or GC. Could you inspect the broker and zookeeper's gc logs? On Thu, Mar 20, 2014 at 3:26 PM, Andrew Otto <o...@wikimedia.org> wrote: > Hm, just saw something a little fishy. > > (In the following logs, analytics1021 (id 21) and analytics1022 (id 22) > are Brokers and analytics1023,analytics1024,analytics1025 are Zookeepers.) > > At 2014-03-20 21:12:26, analytics1021 lost its connection to zookeeper. It > reconnected to analytics1023, but as it was doing so, it logged a > 'zookeeper state changed (Expired)' message, which apparently caused the > newly opened socket to close again. Am I reading that right? > > In the time that it took analytics1021 to finally re-establish a zookeeper > connection, analytics1022 had become the leader for all partitions. I can > rebalance the leaders, but I'd prefer if this didn't happen in the first > place. > > 1. What does 'zookeeper state changed (Expired)' mean? > 2. Has anyone seen issues like this before? Where zookeeper connections > are flaky enough to cause leader elections? > > Thanks! > -Andrew Otto > > > > [2014-03-20 21:12:26,427] 685698013 > [main-SendThread(analytics1025.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard > from server in 10067ms for sessionid 0x1425d903a3f1f89, closing socket > connection and attempting reconnect > [2014-03-20 21:12:26,528] 685698114 [main-EventThread] INFO > org.I0Itec.zkclient.ZkClient - zookeeper state changed (Disconnected) > [2014-03-20 21:12:27,252] 685698838 > [main-SendThread(analytics1025.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Opening socket connection to server > analytics1023.eqiad.wmnet/10.64.5.15:2181 > [2014-03-20 21:12:27,271] 685698857 > [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Socket connection established to > analytics1023.eqiad.wmnet/10.64.5.15:2181, initiating session > [2014-03-20 21:12:27,273] 685698859 [main-EventThread] INFO > org.I0Itec.zkclient.ZkClient - zookeeper state changed (Expired) > [2014-03-20 21:12:27,273] 685698859 > [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, > session 0x1425d903a3f1f89 has expired, closing socket connection > [2014-03-20 21:12:27,273] 685698859 [main-EventThread] INFO > org.apache.zookeeper.ZooKeeper - Initiating client connection, > connectString=analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad > sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@63203b59 > [2014-03-20 21:12:27,290] 685698876 > [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] > INFO kafka.server.KafkaZooKeeper - re-registering broker info in ZK for > broker 21 > [2014-03-20 21:12:27,290] 685698876 [main-EventThread] INFO > org.apache.zookeeper.ClientCnxn - EventThread shut down > [2014-03-20 21:12:27,290] 685698876 [main-SendThread()] INFO > org.apache.zookeeper.ClientCnxn - Opening socket connection to server > analytics1023.eqiad.wmnet/10.64.5.15:2181 > [2014-03-20 21:12:27,292] 685698878 > [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Socket connection established to > analytics1023.eqiad.wmnet/10.64.5.15:2181, initiating session > [2014-03-20 21:12:27,313] 685698899 > [main-SendThread(analytics1023.eqiad.wmnet:2181)] INFO > org.apache.zookeeper.ClientCnxn - Session establishment complete on server > analytics1023.eqiad.wmnet/10.64.5.15:2181, sessionid = > 0xff429b4201dd2352, negotiated timeout = 6000 > [2014-03-20 21:12:27,314] 685698900 [main-EventThread] INFO > org.I0Itec.zkclient.ZkClient - zookeeper state changed (SyncConnected) > [2014-03-20 21:12:27,324] 685698910 > [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] > INFO kafka.server.KafkaZooKeeper - done re-registering broker > [2014-03-20 21:12:27,324] 685698910 > [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] > INFO kafka.server.KafkaZooKeeper - Subscribing to /brokers/topics path to > watch for new topics > [2014-03-20 21:12:27,370] 685698956 > [ZkClient-EventThread-24-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] > INFO kafka.server.ZookeeperLeaderElector$LeaderChangeListener - New leader > is 22 > > >