[ 
https://issues.apache.org/jira/browse/KAFKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743840#comment-15743840
 ] 

Michał Woś commented on KAFKA-1407:
-----------------------------------

[~dmitrybugaychenko] +1
The same issue in 0.8.2.0. Scenario
1. Broker looses session from zookeeper (because some reasons - nvm here)
2. Brokers and zookeeper establish new session and this happens:
{noformat}
2016-12-10 10:28:22,924 INFO org.apache.zookeeper.ClientCnxn: Unable to 
reconnect to ZooKeeper service, session 0x25453ff793e91ed has expired, closing 
socket connection
2016-12-10 10:28:22,924 INFO org.I0Itec.zkclient.ZkClient: zookeeper state 
changed (Expired)
2016-12-10 10:28:22,924 INFO org.apache.zookeeper.ZooKeeper: Initiating client 
connection, connectString=machine_address2, machine_address3, machine_address4 
sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@145eaa29
2016-12-10 10:28:22,936 INFO org.apache.zookeeper.ClientCnxn: EventThread shut 
down
2016-12-10 10:28:22,938 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server machine_address1. Will not attempt to authenticate using 
SASL (unknown error)
2016-12-10 10:28:22,940 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to machine_address1, initiating session
2016-12-10 10:28:22,947 INFO org.apache.zookeeper.ClientCnxn: Session 
establishment complete on server machine_address1, sessionid = 
0x65453fead67efb4, negotiated timeout = 15000
2016-12-10 10:28:22,947 INFO org.I0Itec.zkclient.ZkClient: zookeeper state 
changed (SyncConnected)

2016-12-10 10:28:23,014 INFO kafka.log.OffsetIndex: Deleting index 
/home/amy/data/kafka/topic_name-0/00000000034872197725.index.deleted
2016-12-10 10:28:23,040 INFO kafka.log.OffsetIndex: Deleting index 
/home/amy/data/kafka/topic_name-0/00000000034871445334.index.deleted
2016-12-10 10:28:23,183 INFO kafka.log.OffsetIndex: Deleting index 
/home/amy/data/kafka/topic_name-0/00000000034882006637.index.deleted
2016-12-10 10:28:23,188 INFO kafka.log.OffsetIndex: Deleting index 
/home/amy/data/kafka/topic_name-0/00000000034875965334.index.deleted

2016-12-10 10:28:23,277 INFO 
kafka.controller.KafkaController$SessionExpirationListener: 
[SessionExpirationListener on 2110], ZK expired; shut down all controller 
components and try to re-elect
2016-12-10 10:28:23,319 ERROR org.I0Itec.zkclient.ZkEventThread: Error handling 
event ZkEvent[New session event sent to 
kafka.controller.KafkaController$SessionExpirationListener@70ee82ce]
java.lang.IllegalStateException: Kafka scheduler has not been started
        at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114)
        at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86)
        at 
kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350)
        at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply$mcZ$sp(KafkaController.scala:1108)
        at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply(KafkaController.scala:1107)
        at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply(KafkaController.scala:1107)
        at kafka.utils.Utils$.inLock(Utils.scala:561)
        at 
kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1107)
        at org.I0Itec.zkclient.ZkClientrun(ZkClient.java:472)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
2016-12-10 10:28:23,320 INFO kafka.server.KafkaHealthcheck: re-registering 
broker info in ZK for broker 2110
2016-12-10 10:28:23,367 INFO kafka.utils.ZkUtils$: Registered broker 2110 at 
path /brokers/ids/2110 with address machine_address.
2016-12-10 10:28:23,370 INFO kafka.server.KafkaHealthcheck: done re-registering 
broker
2016-12-10 10:28:23,380 INFO kafka.server.KafkaHealthcheck: Subscribing to 
/brokers/topics path to watch for new topics
2016-12-10 10:28:23,456 INFO 
kafka.server.ZookeeperLeaderElector$LeaderChangeListener: New leader is 828
2016-12-10 10:28:24,820 INFO kafka.log.Log: Rolled new log segment for 
'Dispatcher.client_calls-0' in 1 ms.
{noformat}
Meantime, after session is established, my producer gets for each messege:
{noformat}
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
not the leader for that topic-partition
and messages are lost.
{noformat}

> Broker can not return to ISR because of BadVersionException
> -----------------------------------------------------------
>
>                 Key: KAFKA-1407
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1407
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1
>            Reporter: Dmitry Bugaychenko
>            Assignee: Neha Narkhede
>
> Each morning we found a broker out of ISR at stuck with log full of messages:
> {code}
> INFO   | jvm 1    | 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] ERROR 
> Conditional update of path /brokers/topics/topic2/partitions/1/state with 
> data 
> {"controller_epoch":46,"leader":2,"version":1,"leader_epoch":38,"isr":[2]} 
> and expected version 53 failed due to 
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
> BadVersion for /brokers/topics/topic2/partitions/1/state 
> (kafka.utils.ZkUtils$)
> INFO   | jvm 1    | 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] INFO 
> Partition [topic2,1] on broker 2: Cached zkVersion [53] not equal to that in 
> zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> It seems that it can not recover after short netwrok break down and the only 
> way to return it is restart it using kill -9.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to