[ https://issues.apache.org/jira/browse/KAFKA-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantine Karantasis resolved KAFKA-9848.
-------------------------------------------
    Resolution: Fixed

> Avoid triggering scheduled rebalance delay when task assignment fails but
> Connect workers remain in the group
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9848
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9848
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.3.1, 2.5.0, 2.4.1
>            Reporter: Konstantine Karantasis
>            Assignee: Konstantine Karantasis
>            Priority: Major
>             Fix For: 2.3.2, 2.6.0, 2.4.2, 2.5.1
>
>
> There are cases where a Connect worker does not receive its task assignments
> successfully after a rebalance but will still remain in the group. For
> example, when a SyncGroup response is lost, a worker will not get its expected
> assignments but will rejoin the group immediately and will trigger another
> rebalance.
> With incremental cooperative rebalancing, task assignments that are computed
> and sent by the leader but are not received by any of the members are marked
> as lost assignments in the subsequent rebalance. The presence of lost
> assignments activates the scheduled rebalance delay (property) and the
> missing tasks are not assigned until this delay expires.
> This situation can be improved in two cases:
> a) When it is the leader that failed to receive the new assignments from the
> broker coordinator (for example, if the SyncGroup request or response was
> lost). If this worker remains the leader of the group in the subsequent
> rebalance round, it can detect that the previous assignment was not
> successfully applied by checking the expected generation.
> b) If one or more regular members did not receive their assignments
> successfully, but have joined the latest round of rebalancing, they can be
> assigned the tasks that remain unassigned from the previous assignment
> immediately, without these tasks being marked as lost. The leader can detect
> this by checking that some tasks seem lost since the previous assignment while
> the number of workers is unchanged between the two rounds of
> rebalancing. In this case, the leader can go ahead and assign the missing
> tasks as new tasks immediately.



--
This message was sent by Atlassian Jira (v8.3.4#803005)
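
The two detection cases described in the issue can be sketched roughly as follows. This is a minimal Java illustration under stated assumptions, not the code from the actual fix: the class LostAssignmentSketch, its methods, and the Action enum are hypothetical names, and skipping the delay on a generation mismatch in case (a) is an assumption about what the leader does once it detects that the previous assignment was never applied.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Kafka Connect code base.
class LostAssignmentSketch {

    enum Action { NO_LOST_TASKS, REASSIGN_IMMEDIATELY, START_DELAY }

    // Tasks the leader assigned in the previous round that no member reports now.
    static Set<String> lostTasks(Set<String> previousAssignment, Set<String> reportedByMembers) {
        Set<String> lost = new TreeSet<>(previousAssignment);
        lost.removeAll(reportedByMembers);
        return lost;
    }

    static Action decide(Set<String> lost,
                         int previousMemberCount,
                         int currentMemberCount,
                         int expectedGeneration,
                         int lastAppliedGeneration) {
        if (lost.isEmpty()) {
            return Action.NO_LOST_TASKS;
        }
        // Case (a): the leader itself never applied the previous assignment,
        // which shows up as a generation mismatch (assumed handling).
        boolean leaderMissedAssignment = lastAppliedGeneration != expectedGeneration;
        // Case (b): tasks look lost, but the number of workers is unchanged,
        // so no worker actually left the group.
        boolean sameGroupSize = previousMemberCount == currentMemberCount;
        if (leaderMissedAssignment || sameGroupSize) {
            return Action.REASSIGN_IMMEDIATELY;
        }
        // Otherwise a worker may really be gone: wait for the scheduled delay.
        return Action.START_DELAY;
    }

    public static void main(String[] args) {
        Set<String> lost = lostTasks(Set.of("conn-0", "conn-1", "conn-2"), Set.of("conn-0"));
        System.out.println(lost + " -> " + decide(lost, 3, 3, 5, 5));
    }
}
```

In this sketch the delay is started only when tasks are missing, the group size has changed, and the leader's own generation matches the expected one, which corresponds to the situation where a worker may really have left.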