[ 
https://issues.apache.org/jira/browse/KAFKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-4443:
----------------------------
    Description: 
Currently, in onControllerFailover(), the controller starts 
replicaStateMachine and partitionStateMachine before invoking 
sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq). 
However, if a broker starts right after controller election, the 
LeaderAndIsrRequest sent for the follower partitions on this broker will all 
be ignored because the broker doesn't know that the leaders are alive. 

To fix this problem, in onControllerFailover(), the controller should send 
UpdateMetadataRequest to brokers after initializeControllerContext() but before 
it starts replicaStateMachine and partitionStateMachine. The first 
UpdateMetadataRequest will include the list of live brokers. Although it will 
not include partition leader information, that is OK because we will always 
send UpdateMetadataRequest again when we send LeaderAndIsrRequest during 
replicaStateMachine.startup() and partitionStateMachine.startup().

  was:
Currently in onControllerFailover(), controller will startup 
replicaStatemachine and partitionStateMachine before invoking 
sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq). 
However, if a broker right after controller election, the LeaderAndIsrRequest 
sent to follower partitions on this broker will all be ignored because broker 
doesn't know the leaders are alive. 

To fix this problem, in onControllerFailover(), controller should send 
UpdateMetadataRequest to brokers after initializeControllerContext() but before 
it starts replicaStatemachine and partitionStateMachine. The first 
MetadatUpdateRequest will include list of live broker. Although it will not 
include partition leader information, it is OK because we will always send 
MetadataUpdateRequest again when we send LeaderAndIsrRequest during 
replicaStateMachine.startup() and partitionStateMachine.startup().


> Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest 
> during failover
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4443
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4443
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>              Labels: reliability
>             Fix For: 0.10.1.1
>
>
> Currently, in onControllerFailover(), the controller starts 
> replicaStateMachine and partitionStateMachine before invoking 
> sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq).
> However, if a broker starts right after controller election, the 
> LeaderAndIsrRequest sent for the follower partitions on this broker will all 
> be ignored because the broker doesn't know that the leaders are alive. 
> To fix this problem, in onControllerFailover(), the controller should send 
> UpdateMetadataRequest to brokers after initializeControllerContext() but 
> before it starts replicaStateMachine and partitionStateMachine. The first 
> UpdateMetadataRequest will include the list of live brokers. Although it will 
> not include partition leader information, that is OK because we will always 
> send UpdateMetadataRequest again when we send LeaderAndIsrRequest during 
> replicaStateMachine.startup() and partitionStateMachine.startup().
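
To illustrate why the ordering matters, here is a minimal toy model in Scala. 
It is not Kafka's controller code: the Broker class, the request case classes, 
the replay() helper and the partition names are all hypothetical, and the 
broker's rule for ignoring a request is reduced to "the leader is not in my 
known live-broker set". It also leaves out the second UpdateMetadataRequest 
that accompanies LeaderAndIsrRequest during state machine startup.

```scala
// Toy model only: every name here is hypothetical, not Kafka's API.
object FailoverOrderSketch {
  sealed trait Request
  final case class UpdateMetadata(liveBrokers: Set[Int]) extends Request
  final case class LeaderAndIsr(partition: String, leader: Int) extends Request

  // A broker that only starts following leaders it believes to be alive.
  final class Broker(val id: Int) {
    private var knownLive = Set.empty[Int]
    private var following = Map.empty[String, Int]

    def handle(req: Request): Unit = req match {
      case UpdateMetadata(live) =>
        knownLive = live
      case LeaderAndIsr(partition, leader) =>
        // Assumed rule: drop the request if the leader is not known to be alive.
        if (knownLive.contains(leader)) following += (partition -> leader)
    }

    def followedPartitions: Set[String] = following.keySet
  }

  // Replays a sequence of controller requests against a freshly started broker.
  def replay(requests: Seq[Request]): Set[String] = {
    val broker = new Broker(id = 2)
    requests.foreach(broker.handle)
    broker.followedPartitions
  }

  val live: Set[Int] = Set(1, 2)

  // Requests that the state machine startup would send to the new broker.
  val stateMachineStartup: Seq[Request] =
    Seq(LeaderAndIsr("t-0", leader = 1), LeaderAndIsr("t-1", leader = 1))

  // Current order: state machines start first, live-broker list arrives last.
  val currentOrder: Seq[Request] = stateMachineStartup :+ UpdateMetadata(live)

  // Proposed order: live-broker list first, then the state machines.
  val proposedOrder: Seq[Request] = UpdateMetadata(live) +: stateMachineStartup

  def main(args: Array[String]): Unit = {
    println(s"current order follows:  ${replay(currentOrder)}")
    println(s"proposed order follows: ${replay(proposedOrder)}")
  }
}
```

In this model the current order leaves the new broker following no 
partitions, while the proposed order makes it follow both t-0 and t-1.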



