[ 
https://issues.apache.org/jira/browse/KAFKA-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-9848.
-------------------------------------------
    Resolution: Fixed

> Avoid triggering scheduled rebalance delay when task assignment fails but 
> Connect workers remain in the group
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9848
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9848
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.3.1, 2.5.0, 2.4.1
>            Reporter: Konstantine Karantasis
>            Assignee: Konstantine Karantasis
>            Priority: Major
>             Fix For: 2.3.2, 2.6.0, 2.4.2, 2.5.1
>
>
> There are cases where a Connect worker does not receive its task assignments 
> successfully after a rebalance but still remains in the group. For 
> example, when a SyncGroup response is lost, a worker will not get its expected 
> assignments but will rejoin the group immediately and trigger another 
> rebalance. 
> With incremental cooperative rebalancing, task assignments that are computed 
> and sent by the leader but are not received by any of the members are marked 
> as lost assignments in the subsequent rebalance. The presence of lost 
> assignments activates the scheduled rebalance delay (the 
> scheduled.rebalance.max.delay.ms property), and the missing tasks are not 
> assigned until this delay expires.
> This situation can be improved in two cases: 
> a) When the leader itself failed to receive the new assignments from the 
> broker coordinator (for example, if the SyncGroup request or response was 
> lost). If this worker remains the leader of the group in the subsequent 
> rebalance round, it can detect that the previous assignment was not 
> successfully applied by checking the expected generation.
> b) When one or more regular members did not receive their assignments 
> successfully but have joined the latest round of rebalancing. These members 
> can be assigned the tasks that remain unassigned from the previous assignment 
> immediately, without these tasks being marked as lost. The leader can detect 
> this case by checking that some tasks appear lost since the previous 
> assignment while the number of workers is unchanged between the two rounds of 
> rebalancing. In this case, the leader can go ahead and assign the missing 
> tasks as new tasks immediately.
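
The two checks described in a) and b) can be sketched roughly as follows. This is
a minimal illustration under stated assumptions, not the code of the actual fix:
the class LostAssignmentSketch, the Action enum, and both method names are
hypothetical, tasks are represented as plain strings, and the worker-count
comparison is taken directly from the description above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: every name here is hypothetical and does not
// correspond to Kafka Connect's actual classes or methods.
class LostAssignmentSketch {

    // What the leader does about tasks missing from the members' reports.
    enum Action { NOTHING_MISSING, ASSIGN_IMMEDIATELY, START_SCHEDULED_DELAY }

    // Case a): the leader compares the generation of the assignment it last
    // applied with the generation it expected to have applied. A mismatch
    // suggests the previous assignment never reached the leader.
    static boolean leaderMissedPreviousAssignment(int lastAppliedGeneration,
                                                  int expectedGeneration) {
        return lastAppliedGeneration != expectedGeneration;
    }

    // Case b): tasks from the previous assignment that no member reports are
    // candidates for "lost". If the number of workers is unchanged, assume the
    // members simply missed their assignment and hand the tasks out right away.
    static Action decide(Set<String> previouslyAssigned,
                         Set<String> reportedByMembers,
                         int previousWorkerCount,
                         int currentWorkerCount) {
        Set<String> missing = new HashSet<>(previouslyAssigned);
        missing.removeAll(reportedByMembers);
        if (missing.isEmpty()) {
            return Action.NOTHING_MISSING;
        }
        if (currentWorkerCount == previousWorkerCount) {
            return Action.ASSIGN_IMMEDIATELY;
        }
        // A worker appears to have left: fall back to the scheduled delay.
        return Action.START_SCHEDULED_DELAY;
    }

    public static void main(String[] args) {
        Set<String> previous = new HashSet<>(Arrays.asList("conn-0", "conn-1", "conn-2"));
        Set<String> reported = new HashSet<>(Arrays.asList("conn-0"));
        System.out.println(decide(previous, reported, 3, 3));
        System.out.println(decide(previous, reported, 3, 2));
        System.out.println(leaderMissedPreviousAssignment(4, 5));
    }
}
```

In this sketch the scheduled delay is only started when tasks are missing and
the group has also changed size, which is the distinction the description draws
between a lost assignment and a departed worker.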



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
