[ https://issues.apache.org/jira/browse/KAFKA-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111611#comment-15111611 ]

Mayuresh Gharat commented on KAFKA-3126:
----------------------------------------

Can you explain the steps in further detail:

2. Controller B update ISR of partition p from [A, C] to [C]
   -----> Do you mean the update in Zookeeper?
3. Before the LeaderAndIsrRequest reflecting the change in (2) reaches broker C, broker C expands leader and ISR from [A] to [A, C].
   -----> Does broker C expand the ISR in Zookeeper?
4. The ISR change in 3 was propagated to controller B.
   -----> Do you mean ISR {A,C} was propagated to B?
5. When Broker A actually shuts down, Controller B will see A in the ISR.
   -----> If {A,C} is propagated to B, why would it only see {A}?

> Weird behavior in kafkaController on Controlled shutdowns. The leaderAndIsr
> in zookeeper is not updated during controlled shutdown.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3126
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3126
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Mayuresh Gharat
>            Assignee: Mayuresh Gharat
>
> Consider Broker B is controller, broker A is undergoing shutdown.
> 2016/01/14 19:49:22.884 [KafkaController] [Controller B]: Shutting down
> broker A
> 2016/01/14 19:49:22.918 [ReplicaStateMachine] [Replica state machine on
> controller B]: Invoking state change to OfflineReplica for replicas
> [Topic=testTopic1,Partition=1,Replica=A] -------> (1)
> 2016/01/14 19:49:22.930 [KafkaController] [Controller B]: New leader and ISR
> for partition [testTopic1,1] is {"leader":D,"leader_epoch":1,"isr":[D]}
> ------> (2)
> 2016/01/14 19:49:23.028 [ReplicaStateMachine] [Replica state machine on
> controller B]: Invoking state change to OfflineReplica for replicas
> [Topic=testTopic2,Partition=1,Replica=A] -------> (3)
> 2016/01/14 19:49:23.032 [KafkaController] [Controller B]: New leader and ISR
> for partition [testTopic2,1] is {"leader":C,"leader_epoch":10,"isr":[C]}
> -----> (4)
> 2016/01/14 19:49:23.996 [KafkaController] [Controller B]: Broker failure
> callback for A
> 2016/01/14 19:49:23.997 [PartitionStateMachine] [Partition state machine on
> Controller B]: Invoking state change to OfflinePartition for partitions
> 2016/01/14 19:49:23.998 [ReplicaStateMachine] [Replica state machine on
> controller B]: Invoking state change to OfflineReplica for replicas
> [Topic=testTopic2,Partition=0,Replica=A],
> [Topic=__consumer_offsets,Partition=5,Replica=A],
> [Topic=testTopic1,Partition=2,Replica=A],
> [Topic=__consumer_offsets,Partition=96,Replica=A],
> [Topic=testTopic2,Partition=1,Replica=A],
> [Topic=__consumer_offsets,Partition=36,Replica=A],
> [Topic=testTopic1,Partition=4,Replica=A],
> [Topic=__consumer_offsets,Partition=85,Replica=A],
> [Topic=testTopic1,Partition=6,Replica=A],
> [Topic=testTopic1,Partition=1,Replica=A]
> 2016/01/14 19:49:24.029 [KafkaController] [Controller B]: New leader and ISR
> for partition [testTopic2,1] is {"leader":C,"leader_epoch":11,"isr":[C]}
> ------> (5)
> 2016/01/14 19:49:24.212 [KafkaController] [Controller B]: Cannot remove
> replica A from ISR of partition [testTopic1,1] since it is not in the ISR.
> Leader = D ; ISR = List(D) ----------> (6)
> If after (1) and (2) the controller gets rid of replica A from the ISR in
> zookeeper for [testTopic1-1] as displayed in (6), why doesn't it do the same
> for [testTopic2-1] as per (5)?
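
To make the question concrete, here is a toy model of the sequence in the log, read through the race described in steps 2-5. It is only a sketch under stated assumptions, not Kafka's controller code: `PartitionState`, `remove_replica_from_isr`, `simulate`, the starting leaders and epochs, and the idea that the leader writes A back into the ISR between the two removals are all hypothetical. It only shows that, if such a write-back happened for [testTopic2-1] and not for [testTopic1-1], the controller would log (5) for one partition and (6) for the other.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PartitionState:
    """Hypothetical stand-in for the leaderAndIsr data kept in Zookeeper."""
    leader: str
    leader_epoch: int
    isr: List[str]


def remove_replica_from_isr(zk: Dict[str, PartitionState],
                            partition: str, replica: str) -> str:
    """Toy controller step: drop `replica` from the stored ISR.

    Assumptions: the removed replica is a follower, and the leader epoch
    is bumped on every successful removal (the log shows 10 -> 11).
    """
    state = zk[partition]
    if replica not in state.isr:
        return (f"Cannot remove replica {replica} from ISR of partition "
                f"{partition} since it is not in the ISR.")
    state.isr = [r for r in state.isr if r != replica]
    state.leader_epoch += 1
    return (f"New leader and ISR for partition {partition} is "
            f"leader={state.leader}, leader_epoch={state.leader_epoch}, "
            f"isr={state.isr}")


def simulate() -> Tuple[List[str], Dict[str, PartitionState]]:
    # Assumed starting state; the real pre-shutdown values are not in the log.
    zk = {
        "testTopic1-1": PartitionState(leader="D", leader_epoch=0, isr=["D", "A"]),
        "testTopic2-1": PartitionState(leader="C", leader_epoch=9, isr=["C", "A"]),
    }
    events = []

    # Controlled shutdown of A: log entries (1)-(4).
    events.append(remove_replica_from_isr(zk, "testTopic1-1", "A"))
    events.append(remove_replica_from_isr(zk, "testTopic2-1", "A"))

    # Hypothesized race (step 3): leader C, acting on a stale view,
    # writes A back into the ISR of testTopic2-1.
    zk["testTopic2-1"].isr = ["C", "A"]

    # Broker failure callback for A: log entries (5) and (6).
    events.append(remove_replica_from_isr(zk, "testTopic2-1", "A"))
    events.append(remove_replica_from_isr(zk, "testTopic1-1", "A"))
    return events, zk


if __name__ == "__main__":
    for line in simulate()[0]:
        print(line)
```

If the write-back in the middle is removed, both partitions end with the "Cannot remove replica A" message, which is the behaviour the question expects for [testTopic2-1] as well.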