[ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3083:
---------------------------
    Description: 
The following sequence can happen.

1. Broker A is the controller and is in the middle of processing a broker 
change event. As part of this process, let's say it's about to shrink the isr 
of a partition.

2. Then broker A's session expires and broker B takes over as the new 
controller. Broker B sends the initial leaderAndIsr request to all brokers.

3. Broker A continues by shrinking the isr of the partition in ZK and sends the 
new leaderAndIsr request to the broker (say C) that leads the partition. Broker 
C will reject this leaderAndIsr request since it comes from a controller with 
an older epoch. We could then be in a situation where broker C thinks the isr 
has all replicas, but the isr stored in ZK is different.
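
The rejection in step 3 can be illustrated with a small toy model. This is only a sketch in Python, not Kafka's actual code: the Broker class, the handle_leader_and_isr method, the epoch values 1 and 2, the partition name "p0", and the replica names are all assumptions made for illustration.

```python
class Broker:
    """Toy model of a broker's controller-epoch check (hypothetical names)."""

    def __init__(self):
        self.controller_epoch = 0  # highest controller epoch seen so far
        self.isr = {}              # partition -> isr as cached by this broker

    def handle_leader_and_isr(self, controller_epoch, partition, isr):
        # Ignore requests from a controller whose epoch is older than
        # the latest controller epoch this broker knows about.
        if controller_epoch < self.controller_epoch:
            return False
        self.controller_epoch = controller_epoch
        self.isr[partition] = list(isr)
        return True


# Stand-in for the isr stored in ZK; replica names are made up.
zk_isr = {"p0": ["C", "D", "E"]}
broker_c = Broker()

# Step 2: the new controller B (assumed epoch 2) sends the initial
# leaderAndIsr request with the state it read from ZK.
accepted_new = broker_c.handle_leader_and_isr(2, "p0", zk_isr["p0"])

# Step 3: the old controller A (assumed epoch 1) shrinks the isr in ZK
# first, and only then notifies broker C.
zk_isr["p0"] = ["C"]
accepted_old = broker_c.handle_leader_and_isr(1, "p0", zk_isr["p0"])

# Broker C still holds the full isr, while ZK holds the shrunk one.
diverged = broker_c.isr["p0"] != zk_isr["p0"]
print(accepted_new, accepted_old, diverged)
```

In this model the ZK write and the broker notification are two separate steps, and only the second one is guarded by the controller epoch, which is what lets the two views drift apart.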


  was:
The following sequence can happen.

1. Broker A is the controller and is in the middle of processing a broker 
change event. As part of this process, let's say it's about to shrink the isr 
of a partition.

2. Then broker A's session expires and broker B takes over as the new 
controller. Broker B sends the initial leaderAndIsr request to all brokers.

3. Broker A continues by shrinking the isr of the partition in ZK and sends the 
new leaderAndIsr request to the broker (say C) that leads the partition. Broker 
C will reject this leaderAndIsr request since it comes from a controller with 
an older epoch. We could then be in a situation where broker C thinks the isr 
has all replicas, but the isr stored in ZK is different.




1. Originally, broker 12 was the controller with controller epoch 4. It 
received the following broker change event and was in the middle of processing 
this event by selecting new leaders and shrinking ISRs.

2015-12-25 09:10:57,339 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [BrokerChangeListener on Controller 12]: Newly added brokers: , deleted 
brokers: 
0,10,56,42,25,20,29,1,33,9,53,41,64,59,27,49,7,39,35,11,55,8,30,19,4,47,68, all 
live brokers: 
5,24,37,52,14,46,57,61,6,60,28,38,70,21,65,13,2,32,34,45,17,22,44,71,54,66,3,48,63,18,50,67,16,31,43,40,26,23,58,36,51,15,62

2. Then broker 12's ZK session expired and broker 30 took over as the 
controller with controller epoch 6.

2015-12-25 09:11:11,012 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 30]: Controller 30 incremented epoch to 6

3. Controller 30 read the current leaderAndIsr for [streaming_client_log,3] 
(with leader epoch 5) from ZK during initialization and sent it to broker 31 
(the leader of streaming_client_log,3) with controller epoch 6.

4. Old controller 12 continued from step 1. It shrank the ISR for 
[streaming_client_log,3] and changed leader epoch to 6.

2015-12-25 09:11:13,274 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 12]: New leader and ISR for partition 
[streaming_client_log,3] is {"leader":31,"leader_epoch":6,"isr":[31]}

5. Old controller 12 sent leaderAndIsr to broker 31, but it was ignored since 
the highest controller epoch on broker 31 was 6, which is higher than the 
controller epoch 4 in the leaderAndIsr request.

2015-12-25 09:11:15,484 WARN kafka.utils.Logging$class:83 
[kafka-request-handler-6] [warn] Broker 31 ignoring LeaderAndIsr request from 
controller 12 with correlation id 769 since its controller epoch 4 is old. 
Latest known controller epoch is 6

6. Old controller 12 finally received the ZK session expiration event and 
stopped acting as the controller.
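
Steps 3 to 5 above can be replayed with the values from the logs. The following is a minimal Python sketch, not Kafka code: the LeaderAndIsr dataclass, the deliver helper, and the dictionaries standing in for ZK and broker 31 are hypothetical, and OTHER_REPLICA is a placeholder because the logs do not show the ISR before it was shrunk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderAndIsr:
    leader: int
    leader_epoch: int
    isr: tuple


PARTITION = ("streaming_client_log", 3)
OTHER_REPLICA = -1  # placeholder: the pre-shrink ISR is not in the logs

# State in ZK before the old controller's write (leader epoch 5).
zk = {PARTITION: LeaderAndIsr(31, 5, (31, OTHER_REPLICA))}
broker31 = {"controller_epoch": 0, "state": {}}


def deliver(broker, controller_epoch, partition, state):
    """Apply a leaderAndIsr request unless its controller epoch is stale."""
    if controller_epoch < broker["controller_epoch"]:
        return False
    broker["controller_epoch"] = controller_epoch
    broker["state"][partition] = state
    return True


# Step 3: controller 30 (controller epoch 6) sends what it read from ZK.
step3_accepted = deliver(broker31, 6, PARTITION, zk[PARTITION])

# Step 4: old controller 12 writes the shrunk ISR with leader epoch 6 to ZK.
zk[PARTITION] = LeaderAndIsr(31, 6, (31,))

# Step 5: old controller 12 (controller epoch 4) sends it to broker 31.
step5_accepted = deliver(broker31, 4, PARTITION, zk[PARTITION])

print("step 3 accepted:", step3_accepted)
print("step 5 accepted:", step5_accepted)
print("ZK:       ", zk[PARTITION])
print("broker 31:", broker31["state"][PARTITION])
```

In this replay broker 31 ends up with leader epoch 5 while ZK holds leader epoch 6 and the shrunk ISR, matching the inconsistency described in the issue.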



> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-3083
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3083
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr request since it comes from a 
> controller with an older epoch. We could then be in a situation where broker 
> C thinks the isr has all replicas, but the isr stored in ZK is different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
