[jira] [Comment Edited] (KAFKA-6029) Controller should wait for the leader migration to finish before ack a ControlledShutdownRequest

Jason Gustafson (JIRA) Mon, 20 May 2019 16:37:13 -0700


    [ https://issues.apache.org/jira/browse/KAFKA-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844388#comment-16844388 ]


Jason Gustafson edited comment on KAFKA-6029 at 5/20/19 11:36 PM:
------------------------------------------------------------------

I think we can actually resolve this as an unintended benefit of 
[KIP-320|https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation].
 When the controller shrinks the ISR, it bumps the epoch. The bumped epoch 
prevents the shutting down follower from being added back to the ISR. The 
controller may still send a LeaderAndIsr request to the shutting down broker 
with the updated epoch, but the shutting down broker will not restart the 
fetcher.
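
To make the fencing argument concrete, here is a minimal sketch of the idea in Java. It is a toy model, not Kafka's actual code: the class PartitionState and its methods shrinkIsr and handleFollowerFetch are hypothetical names invented for illustration, and the model assumes that any accepted fetch counts as "caught up". It only shows why a follower that still carries the pre-shrink epoch cannot get back into the ISR.

```java
import java.util.HashSet;
import java.util.Set;

class EpochFencingSketch {

    /** Hypothetical, simplified leader-side view of a single partition. */
    static final class PartitionState {
        int leaderEpoch;
        final Set<Integer> isr = new HashSet<>();

        PartitionState(int leaderEpoch, int... isrBrokerIds) {
            this.leaderEpoch = leaderEpoch;
            for (int id : isrBrokerIds) {
                isr.add(id);
            }
        }

        /** Controller-driven ISR shrink: drop the broker and bump the epoch. */
        void shrinkIsr(int brokerId) {
            isr.remove(brokerId);
            leaderEpoch++;
        }

        /**
         * Handles a follower fetch that carries the epoch the follower knows.
         * A fetch with a stale epoch is rejected (fenced), so the follower
         * cannot be added back to the ISR. In this toy model an accepted
         * fetch is treated as fully caught up and re-adds the follower.
         */
        boolean handleFollowerFetch(int brokerId, int fetchEpoch) {
            if (fetchEpoch != leaderEpoch) {
                return false; // fenced: follower is on an old epoch
            }
            isr.add(brokerId);
            return true;
        }
    }
}
```

In this model, a partition at epoch 5 with ISR {1, 2, 3} moves to epoch 6 with ISR {1, 2} once broker 3 is removed, and any later fetch from broker 3 that still carries epoch 5 is rejected. Broker 3 would only rejoin if it restarted its fetcher with the new epoch, which the shutting down broker does not do.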


was (Author: hachikuji):
I think we can actually resolve this as a unintended benefit of 
[KIP-320|https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation].
 When the controller shrinks the ISR, it bumps the epoch. The bumped epoch 
prevents the shutting down follower from being added back to the ISR. The 
controller may still send a LeaderAndIsr request to the shutting down broker 
with the updated epoch, but the shutting down broker will not restart the 
fetcher.

> Controller should wait for the leader migration to finish before ack a 
> ControlledShutdownRequest
> ------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6029
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6029
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: controller, core
>    Affects Versions: 1.0.0
>            Reporter: Jiangjie Qin
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> In the controlled shutdown process, the controller returns the 
> ControlledShutdownResponse immediately after the state machine is updated. 
> Because the LeaderAndIsrRequests and UpdateMetadataRequests may not yet have 
> been successfully processed by the brokers, the leader migration and active 
> ISR shrink may not have finished when the shutting-down broker proceeds to 
> shut down. This causes some of the leaders to take up to 
> replica.lag.time.max.ms to kick the broker out of the ISR. Meanwhile, the 
> produce purgatory size grows.
> Ideally, the controller should wait until all the LeaderAndIsrRequests and 
> UpdateMetadataRequests have been acked before sending back the 
> ControlledShutdownResponse.
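
The behaviour the description asks for (reply only after every outstanding request has been acked) can be sketched roughly as follows. This is an illustration under stated assumptions, not the controller's real implementation: the class ControlledShutdownAckSketch, the method respondAfterAcks, and the use of one CompletableFuture per outstanding LeaderAndIsrRequest or UpdateMetadataRequest are all hypothetical, and the response is modelled as a plain string.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

class ControlledShutdownAckSketch {

    /**
     * Returns a future that completes with the (placeholder) response only
     * after every pending ack future has completed. Each element of
     * pendingAcks is assumed to stand for one outstanding request to a broker.
     */
    static CompletableFuture<String> respondAfterAcks(
            List<CompletableFuture<Void>> pendingAcks) {
        return CompletableFuture
                .allOf(pendingAcks.toArray(new CompletableFuture<?>[0]))
                .thenApply(ignored -> "ControlledShutdownResponse");
    }
}
```

A real controller would also need to bound the wait, for example with a timeout, so that one unresponsive broker cannot block the shutdown indefinitely; that part is left out of the sketch.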



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
