[ https://issues.apache.org/jira/browse/KAFKA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yoonhyeok Kim updated KAFKA-1857: --------------------------------- Description: Hi, I am using kind of Real-time analytics system with zookeeper, Storm & Kafka. -versions Storm : storm-corer-0.9.2 Kafka 0.8.1 (3 brokers) storm-kafka : 0.9.2 zookeeper 3.4.6 (standalone) But this problem occurs when I use pre-versions as well. - exceptions EndOfStreamException, java.nio.channels.CancelledKeyException, org.apache.zookeeper.KeeperException$BadVersionException ----------------------- When I use kafka spout with storm , sometimes there was zookeeper logs like (zookeeper.out) {quote} 2015-01-10 19:19:00,836 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b0658, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) {quote} still, zookeeper is working well, and storm-kafka looks fine , transfers data rightly. But as time goes by, those kind of Error keep occurs and then I saw different logs like... {quote} 2015-01-10 23:22:11,022 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:48504 which had sessionid 0x14ab82c142b0644 2015-01-10 23:22:11,023 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b001d, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,023 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:55885 which had sessionid 0x14ab82c142b001d 2015-01-10 23:22:11,023 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b063e, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,026 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:48444 which had sessionid 0x14ab82c142b063e 2015-01-10 23:22:11,026 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b0639, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,027 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:56724 which had sessionid 0x14ab82c142b0658 2015-01-10 23:22:11,431 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,433 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) {quote} and it goes like {quote} 2015-01-10 23:22:11,455 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,873 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58008 2015-01-10 23:22:11,889 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /70.7.12.38:58008; will be dropped if server is in r-o mode 2015-01-10 23:22:11,889 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x14ab82c142b000c at /70.7.12.38:58008 2015-01-10 23:22:11,890 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@610] - Invalid session 0x14ab82c142b000c for client /70.7.12.38:58008, probably expired 2015-01-10 23:22:11,890 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:58008 which had sessionid 0x14ab82c142b000c 2015-01-10 23:22:12,111 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58009 2015-01-10 23:22:12,138 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /70.7.12.38:58009; will be dropped if server is in r-o mode 2015-01-10 23:22:12,138 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /70.7.12.38:58009 2015-01-10 23:22:12,238 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58010 2015-01-10 23:22:12,239 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x14ab82c142b0644 at /70.7.12.38:58010 2015-01-10 23:22:12,240 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@617] - Established session 0x14ab82c142b0644 with negotiated timeout 20000 for client /70.7.12.38:58010 2015-01-10 23:22:12,271 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14ab82c142b0659 with negotiated timeout 6000 for client /70.7.12.38:58009 {quote} And finally,,, out of nowhere, SOMETHING removed kafka brokers (ids) in zookeeper. There is no longer /brokers/ids in zkCli.sh (before, there was [1,2,3] three kafka brokers which I made) This is a huge problem, cause after brokers gone, data transfer stops immediately. and zookeeper says like(zookeeper.out) {quote} 2015-01-10 23:30:56,691 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x65 zxid:0x108720 txntype:-1 reqpath:n/a Error Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/order-topic/partitions/2/state 2015-01-10 23:30:56,692 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x66 zxid:0x108721 txntype:-1 reqpath:n/a Error Path:/brokers/topics/ais-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/ais-topic/partitions/2/state 2015-01-10 23:31:06,690 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x67 zxid:0x10873c txntype:-1 reqpath:n/a Error Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/order-topic/partitions/2/state {quote} and Kafka (server.log) logs like {quote} [2015-01-12 13:07:46,692] INFO Partition [order-topic,2] on broker 1: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) [2015-01-12 13:07:46,692] INFO Partition [ais-topic,2] on broker 1: Shrinking ISR for partition [ais-topic,2] from 1,2 to 1 (kafka.cluster.Partition) [2015-01-12 13:07:46,693] ERROR Conditional update of path /brokers/topics/ais-topic/partitions/2/state with data {"controller_epoch":1,"leader":1,"version":1,"leader_epoch":1,"isr":[1]} and expected version 2 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/ais-topic/partitions/2/state (kafka.utils.ZkUtils$) [2015-01-12 13:07:46,693] INFO Partition [ais-topic,2] on broker 1: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) {quote} - my topic name is ais-topic, order-topic. 2 replicates. and broker ids are 1,2,3 . I did nothing on server, since 2015-01-09 18:37. At that time, I killed two storm topologies and re-submit. next 2 days, system runs fine. Although "EndOfStreamException" occured since I started zookeeper, there was no critical issues within 3days... so I think there would be another reason. I must know why and what is removing kafka-broker so that I can avoid that critical errors. (even the logs are not recognized , I only care about brokers) please help me out of this problem any chance. if I wrote a wrong issue on this dashboard, sorry about that... I am new here. was: Hi, I am using kind of Real-time analytics system with zookeeper, Storm & Kafka. versions Storm : storm-corer-0.9.2 Kafka 0.8.1 (3 brokers) storm-kafka : 0.9.2 zookeeper 3.4.6 (standalone) But this problem occurs when I use pre-versions as well. When I use kafka spout with storm , somtimes there were zookeeper logs like (zookeeper.out) {quote} 2015-01-10 19:19:00,836 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b0658, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) {quote} still, zookeeper is working well, and storm-kafka looks fine , transfers data rightly. But as time goes by, those kind of Error keep occurs and then I saw different logs like... {quote} 2015-01-10 23:22:11,022 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:48504 which had sessionid 0x14ab82c142b0644 2015-01-10 23:22:11,023 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b001d, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,023 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:55885 which had sessionid 0x14ab82c142b001d 2015-01-10 23:22:11,023 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b063e, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,026 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:48444 which had sessionid 0x14ab82c142b063e 2015-01-10 23:22:11,026 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b0639, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,026 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:48380 which had sessionid 0x14ab82c142b0639 2015-01-10 23:22:11,027 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x14ab82c142b0658, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:619) 2015-01-10 23:22:11,027 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:56724 which had sessionid 0x14ab82c142b0658 2015-01-10 23:22:11,431 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,433 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) {quote} and it goes like {quote} 2015-01-10 23:22:11,455 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - Unexpected Exception: java.nio.channels.CancelledKeyException at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) 2015-01-10 23:22:11,873 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58008 2015-01-10 23:22:11,889 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /70.7.12.38:58008; will be dropped if server is in r-o mode 2015-01-10 23:22:11,889 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x14ab82c142b000c at /70.7.12.38:58008 2015-01-10 23:22:11,890 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@610] - Invalid session 0x14ab82c142b000c for client /70.7.12.38:58008, probably expired 2015-01-10 23:22:11,890 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /70.7.12.38:58008 which had sessionid 0x14ab82c142b000c 2015-01-10 23:22:12,111 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58009 2015-01-10 23:22:12,138 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /70.7.12.38:58009; will be dropped if server is in r-o mode 2015-01-10 23:22:12,138 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /70.7.12.38:58009 2015-01-10 23:22:12,238 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /70.7.12.38:58010 2015-01-10 23:22:12,239 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x14ab82c142b0644 at /70.7.12.38:58010 2015-01-10 23:22:12,240 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@617] - Established session 0x14ab82c142b0644 with negotiated timeout 20000 for client /70.7.12.38:58010 2015-01-10 23:22:12,271 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14ab82c142b0659 with negotiated timeout 6000 for client /70.7.12.38:58009 {quote} And finally,,, out of nowhere, SOMETHING removed kafka brokers (ids) in zookeeper. There is no longer /brokers/ids in zkCli.sh (before, there was [1,2,3] three kafka brokers which I made) This is a huge problem, cause after brokers gone, data transfer stops immediately. and zookeeper says like(zookeeper.out) {quote} 2015-01-10 23:30:56,691 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x65 zxid:0x108720 txntype:-1 reqpath:n/a Error Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/order-topic/partitions/2/state 2015-01-10 23:30:56,692 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x66 zxid:0x108721 txntype:-1 reqpath:n/a Error Path:/brokers/topics/ais-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/ais-topic/partitions/2/state 2015-01-10 23:31:06,690 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14ab82c142b0659 type:setData cxid:0x67 zxid:0x10873c txntype:-1 reqpath:n/a Error Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = BadVersion for /brokers/topics/order-topic/partitions/2/state {quote} and Kafka (server.log) logs like {quote} [2015-01-12 13:07:46,692] INFO Partition [order-topic,2] on broker 1: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) [2015-01-12 13:07:46,692] INFO Partition [ais-topic,2] on broker 1: Shrinking ISR for partition [ais-topic,2] from 1,2 to 1 (kafka.cluster.Partition) [2015-01-12 13:07:46,693] ERROR Conditional update of path /brokers/topics/ais-topic/partitions/2/state with data {"controller_epoch":1,"leader":1,"version":1,"leader_epoch":1,"isr":[1]} and expected version 2 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/ais-topic/partitions/2/state (kafka.utils.ZkUtils$) [2015-01-12 13:07:46,693] INFO Partition [ais-topic,2] on broker 1: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) {quote} - my topic name is ais-topic, order-topic. 2 replicates. and broker ids are 1,2,3 . I did nothing on server, since 2015-01-09 18:37. At that time, I killed two storm topologies and re-submit. next 2 days, system runs fine. Although "EndOfStreamException" occured since I started zookeeper, there was no critical issues within 3days... so I think there would be another reason. I must know why and what is removing kafka-broker so that I can avoid that critical errors. (even the logs are not recognized , I only care about brokers) please help me out of this problem any chance. if I wrote a wrong issue on this dashboard, sorry about that... I am new here. > Kafka Broker ids are removed ( with zookeeper , Storm ) > ------------------------------------------------------- > > Key: KAFKA-1857 > URL: https://issues.apache.org/jira/browse/KAFKA-1857 > Project: Kafka > Issue Type: Bug > Components: consumer, controller > Affects Versions: 0.8.1 > Environment: Ubuntu , With Storm-kafka and zookeeeper 3.4.6 > Reporter: Yoonhyeok Kim > Assignee: Neha Narkhede > > Hi, > I am using kind of Real-time analytics system with > zookeeper, Storm & Kafka. > -versions > Storm : storm-corer-0.9.2 > Kafka 0.8.1 (3 brokers) > storm-kafka : 0.9.2 > zookeeper 3.4.6 (standalone) > But this problem occurs when I use pre-versions as well. > - exceptions > EndOfStreamException, > java.nio.channels.CancelledKeyException, > org.apache.zookeeper.KeeperException$BadVersionException > ----------------------- > When I use kafka spout with storm , sometimes there was zookeeper logs like > (zookeeper.out) > {quote} > 2015-01-10 19:19:00,836 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b0658, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > {quote} > still, zookeeper is working well, and storm-kafka looks fine , transfers data > rightly. > But as time goes by, those kind of Error keep occurs and then I saw different > logs like... > {quote} > 2015-01-10 23:22:11,022 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:48504 which had sessionid > 0x14ab82c142b0644 > 2015-01-10 23:22:11,023 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b001d, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,023 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:55885 which had sessionid > 0x14ab82c142b001d > 2015-01-10 23:22:11,023 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b063e, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,026 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:48444 which had sessionid > 0x14ab82c142b063e > 2015-01-10 23:22:11,026 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b0639, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,027 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:56724 which had sessionid > 0x14ab82c142b0658 > 2015-01-10 23:22:11,431 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,433 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > {quote} > and it goes like > {quote} > 2015-01-10 23:22:11,455 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,873 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58008 > 2015-01-10 23:22:11,889 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection > request from old client /70.7.12.38:58008; will be dropped if server is in > r-o mode > 2015-01-10 23:22:11,889 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client > attempting to renew session 0x14ab82c142b000c at /70.7.12.38:58008 > 2015-01-10 23:22:11,890 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@610] - Invalid > session 0x14ab82c142b000c for client /70.7.12.38:58008, probably expired > 2015-01-10 23:22:11,890 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:58008 which had sessionid > 0x14ab82c142b000c > 2015-01-10 23:22:12,111 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58009 > 2015-01-10 23:22:12,138 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection > request from old client /70.7.12.38:58009; will be dropped if server is in > r-o mode > 2015-01-10 23:22:12,138 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client > attempting to establish new session at /70.7.12.38:58009 > 2015-01-10 23:22:12,238 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58010 > 2015-01-10 23:22:12,239 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client > attempting to renew session 0x14ab82c142b0644 at /70.7.12.38:58010 > 2015-01-10 23:22:12,240 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@617] - Established > session 0x14ab82c142b0644 with negotiated timeout 20000 for client > /70.7.12.38:58010 > 2015-01-10 23:22:12,271 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - > Established session 0x14ab82c142b0659 with negotiated timeout 6000 for client > /70.7.12.38:58009 > {quote} > And finally,,, > out of nowhere, SOMETHING removed kafka brokers (ids) in zookeeper. > There is no longer /brokers/ids in zkCli.sh (before, there was [1,2,3] > three kafka brokers which I made) > This is a huge problem, cause after brokers gone, data transfer stops > immediately. > and zookeeper says like(zookeeper.out) > {quote} > 2015-01-10 23:30:56,691 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x65 zxid:0x108720 > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/order-topic/partitions/2/state > 2015-01-10 23:30:56,692 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x66 zxid:0x108721 > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/ais-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/ais-topic/partitions/2/state > 2015-01-10 23:31:06,690 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x67 zxid:0x10873c > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/order-topic/partitions/2/state > {quote} > and Kafka (server.log) logs like > {quote} > [2015-01-12 13:07:46,692] INFO Partition [order-topic,2] on broker 1: Cached > zkVersion [2] not equal to that in zookeeper, skip updating ISR > (kafka.cluster.Partition) > [2015-01-12 13:07:46,692] INFO Partition [ais-topic,2] on broker 1: Shrinking > ISR for partition [ais-topic,2] from 1,2 to 1 (kafka.cluster.Partition) > [2015-01-12 13:07:46,693] ERROR Conditional update of path > /brokers/topics/ais-topic/partitions/2/state with data > {"controller_epoch":1,"leader":1,"version":1,"leader_epoch":1,"isr":[1]} and > expected version 2 failed due to > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /brokers/topics/ais-topic/partitions/2/state > (kafka.utils.ZkUtils$) > [2015-01-12 13:07:46,693] INFO Partition [ais-topic,2] on broker 1: Cached > zkVersion [2] not equal to that in zookeeper, skip updating ISR > (kafka.cluster.Partition) > {quote} > - my topic name is ais-topic, order-topic. 2 replicates. > and broker ids are 1,2,3 . > I did nothing on server, since 2015-01-09 18:37. > At that time, I killed two storm topologies and re-submit. > next 2 days, system runs fine. Although > "EndOfStreamException" occured since I started zookeeper, there was no > critical issues within 3days... so I think > there would be another reason. > I must know why and what is removing kafka-broker so that I can avoid that > critical errors. (even the logs are not recognized , I only care about > brokers) > please help me out of this problem any chance. > if I wrote a wrong issue on this dashboard, sorry about that... I am new > here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)