[ 
https://issues.apache.org/jira/browse/KAFKA-17793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17793.
------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

> Improve kcontroller robustness against long delays
> --------------------------------------------------
>
>                 Key: KAFKA-17793
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17793
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Colin McCabe
>            Assignee: Colin McCabe
>            Priority: Major
>             Fix For: 4.0.0
>
>
> As described in KIP-500, the Kafka controller monitors the liveness of each 
> broker in the cluster. It gathers this information from heartbeats sent from 
> the brokers themselves.
> In some rare cases, the main controller thread may get blocked for several 
> seconds at a time. In the current code, the controller cannot update the 
> last contact times for the brokers while it is blocked.
> This PR changes the controller heartbeat handling to be partially lockless. 
> Specifically, the last contact time for each broker will be updated 
> locklessly before the rest of the heartbeat handling runs. This ensures that 
> a heartbeat is recorded even while the main controller thread is blocked.
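
To illustrate the idea, here is a minimal sketch of updating last contact
times without involving the main controller thread. It is not the code from
the PR: the class name LastContactTimes and both method names are
hypothetical, and it assumes timestamps come from a monotonic nanosecond
clock. Note that ConcurrentHashMap may briefly synchronize internally when a
broker is first inserted; later updates are plain atomic operations.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Kafka's actual class.
final class LastContactTimes {
    // One atomic timestamp per broker ID.
    private final ConcurrentHashMap<Integer, AtomicLong> lastContactNs =
        new ConcurrentHashMap<>();

    // May be called from any request-handling thread as soon as a heartbeat
    // arrives, before the heartbeat is queued for the controller thread.
    void recordHeartbeat(int brokerId, long nowNs) {
        lastContactNs
            .computeIfAbsent(brokerId, id -> new AtomicLong(nowNs))
            // Keep the newest timestamp so a late update cannot move it back.
            .accumulateAndGet(nowNs, Math::max);
    }

    // Called by the controller thread when it checks for stale brokers.
    boolean isTimedOut(int brokerId, long nowNs, long timeoutNs) {
        AtomicLong last = lastContactNs.get(brokerId);
        if (last == null) {
            return true; // no heartbeat has ever been recorded
        }
        return nowNs - last.get() > timeoutNs;
    }
}
```

With this shape, a controller thread that was blocked for a few seconds still
sees the heartbeats that arrived in the meantime when it next checks for
timed-out brokers.
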
> Additionally, this PR adds a PeriodicTaskControlManager to better manage 
> periodic tasks. This should help with the very common pattern of scheduling 
> a background task at some frequency. It should also let a background task 
> be rescheduled immediately when there is too much work to do in one event.
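
The scheduling rule described above could look roughly like the sketch
below. This is not the PeriodicTaskControlManager API: the class
PeriodicTaskSketch, its fields, and runOnce are hypothetical names chosen for
illustration. The assumption is that each run handles one bounded batch and
reports whether work remains.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not Kafka's PeriodicTaskControlManager.
final class PeriodicTaskSketch {
    // Runs one bounded batch of work; returns true if more work remains.
    private final BooleanSupplier op;
    // Normal interval between runs, in nanoseconds.
    private final long periodNs;

    PeriodicTaskSketch(BooleanSupplier op, long periodNs) {
        this.op = op;
        this.periodNs = periodNs;
    }

    // Runs one batch and returns the time at which the task should run next.
    long runOnce(long nowNs) {
        boolean moreWork = op.getAsBoolean();
        // Leftover work: run again right away instead of waiting a full period.
        return moreWork ? nowNs : nowNs + periodNs;
    }
}
```

Bounding the work done per run keeps any single event short, and the
immediate reschedule keeps a backlog from waiting a whole period between
batches.
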



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
