[ https://issues.apache.org/jira/browse/KAFKA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274620#comment-14274620 ]
Neha Narkhede commented on KAFKA-1857: -------------------------------------- You might find this useful - http://kafka.apache.org/contact.html > Kafka Broker ids are removed ( with zookeeper , Storm ) > ------------------------------------------------------- > > Key: KAFKA-1857 > URL: https://issues.apache.org/jira/browse/KAFKA-1857 > Project: Kafka > Issue Type: Bug > Components: consumer, controller > Affects Versions: 0.8.1 > Environment: Ubuntu , With Storm-kafka and zookeeeper 3.4.6 > Reporter: Yoonhyeok Kim > > Hi, > I am using kind of Real-time analytics system with > zookeeper, Storm & Kafka. > -versions > Storm : storm-corer-0.9.2 > Kafka 0.8.1 (3 brokers) > storm-kafka : 0.9.2 > zookeeper 3.4.6 (standalone) > But this problem occurs when I use pre-versions as well. > - exceptions > EndOfStreamException, > java.nio.channels.CancelledKeyException, > org.apache.zookeeper.KeeperException$BadVersionException > ----------------------- > When I use kafka spout with storm , sometimes there was zookeeper logs like > (zookeeper.out) > {code} > 2015-01-10 19:19:00,836 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b0658, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > {code} > still, zookeeper is working well, and storm-kafka looks fine , transfers data > rightly. > But as time goes by, those kind of Error keep occurs and then I saw different > logs like... > {code} > 2015-01-10 23:22:11,022 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:48504 which had sessionid > 0x14ab82c142b0644 > 2015-01-10 23:22:11,023 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b001d, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,023 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:55885 which had sessionid > 0x14ab82c142b001d > 2015-01-10 23:22:11,023 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b063e, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,026 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:48444 which had sessionid > 0x14ab82c142b063e > 2015-01-10 23:22:11,026 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of > stream exception > EndOfStreamException: Unable to read additional data from client sessionid > 0x14ab82c142b0639, likely client has closed socket > at > org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) > at java.lang.Thread.run(Thread.java:619) > 2015-01-10 23:22:11,027 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:56724 which had sessionid > 0x14ab82c142b0658 > 2015-01-10 23:22:11,431 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,432 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:170) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,433 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > {code} > and it goes like > {code} > 2015-01-10 23:22:11,455 [myid:] - ERROR [SyncThread:0:NIOServerCnxn@178] - > Unexpected Exception: > java.nio.channels.CancelledKeyException > at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55) > at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59) > at > org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151) > at > org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081) > at > org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131) > 2015-01-10 23:22:11,873 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58008 > 2015-01-10 23:22:11,889 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection > request from old client /70.7.12.38:58008; will be dropped if server is in > r-o mode > 2015-01-10 23:22:11,889 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client > attempting to renew session 0x14ab82c142b000c at /70.7.12.38:58008 > 2015-01-10 23:22:11,890 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@610] - Invalid > session 0x14ab82c142b000c for client /70.7.12.38:58008, probably expired > 2015-01-10 23:22:11,890 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed > socket connection for client /70.7.12.38:58008 which had sessionid > 0x14ab82c142b000c > 2015-01-10 23:22:12,111 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58009 > 2015-01-10 23:22:12,138 [myid:] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection > request from old client /70.7.12.38:58009; will be dropped if server is in > r-o mode > 2015-01-10 23:22:12,138 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client > attempting to establish new session at /70.7.12.38:58009 > 2015-01-10 23:22:12,238 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted socket connection from /70.7.12.38:58010 > 2015-01-10 23:22:12,239 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client > attempting to renew session 0x14ab82c142b0644 at /70.7.12.38:58010 > 2015-01-10 23:22:12,240 [myid:] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@617] - Established > session 0x14ab82c142b0644 with negotiated timeout 20000 for client > /70.7.12.38:58010 > 2015-01-10 23:22:12,271 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - > Established session 0x14ab82c142b0659 with negotiated timeout 6000 for client > /70.7.12.38:58009 > {code} > And finally,,, > out of nowhere, SOMETHING removed kafka brokers (ids) in zookeeper. > There is no longer /brokers/ids in zkCli.sh (before, there was [1,2,3] > three kafka brokers which I made) > This is a huge problem, cause after brokers gone, data transfer stops > immediately. > and zookeeper says like(zookeeper.out) > {code} > 2015-01-10 23:30:56,691 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x65 zxid:0x108720 > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/order-topic/partitions/2/state > 2015-01-10 23:30:56,692 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x66 zxid:0x108721 > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/ais-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/ais-topic/partitions/2/state > 2015-01-10 23:31:06,690 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when > processing sessionid:0x14ab82c142b0659 type:setData cxid:0x67 zxid:0x10873c > txntype:-1 reqpath:n/a Error > Path:/brokers/topics/order-topic/partitions/2/state Error:KeeperErrorCode = > BadVersion for /brokers/topics/order-topic/partitions/2/state > {code} > and Kafka (server.log) logs like > {code} > [2015-01-12 13:07:46,692] INFO Partition [order-topic,2] on broker 1: Cached > zkVersion [2] not equal to that in zookeeper, skip updating ISR > (kafka.cluster.Partition) > [2015-01-12 13:07:46,692] INFO Partition [ais-topic,2] on broker 1: Shrinking > ISR for partition [ais-topic,2] from 1,2 to 1 (kafka.cluster.Partition) > [2015-01-12 13:07:46,693] ERROR Conditional update of path > /brokers/topics/ais-topic/partitions/2/state with data > {"controller_epoch":1,"leader":1,"version":1,"leader_epoch":1,"isr":[1]} and > expected version 2 failed due to > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /brokers/topics/ais-topic/partitions/2/state > (kafka.utils.ZkUtils$) > [2015-01-12 13:07:46,693] INFO Partition [ais-topic,2] on broker 1: Cached > zkVersion [2] not equal to that in zookeeper, skip updating ISR > (kafka.cluster.Partition) > {code} > - my topic name is ais-topic, order-topic. 2 replicates. > and broker ids are 1,2,3 . > I did nothing on server, since 2015-01-09 18:37. > At that time, I killed two storm topologies and re-submit. > next 2 days, system runs fine. Although > "EndOfStreamException" occured since I started zookeeper, there was no > critical issues within 3days... so I think > there would be another reason. > I must know why and what is removing kafka-broker so that I can avoid that > critical errors. (even the logs are not recognized , I only care about > brokers) > please help me out of this problem any chance. > if I wrote a wrong issue on this dashboard, sorry about that... I am new > here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)