[ https://issues.apache.org/jira/browse/KAFKA-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209685#comment-17209685 ]
Guozhang Wang commented on KAFKA-10122:
---------------------------------------

This ticket is resolved as a result of KAFKA-10134: when a join-group request is received, the broker would disable heartbeat timing for the client, and the timing is resumed only when a sync-group request is received. With KAFKA-10134 we resume heartbeating during COMPLETING REBALANCE as well, so this is no longer an issue.

> Consumer should allow heartbeat during rebalance as well
> --------------------------------------------------------
>
>                 Key: KAFKA-10122
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10122
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>            Priority: Major
>
> Today we disable heartbeats if {{state != MemberState.STABLE}}, and if a
> rebalance fails we set the state to UNJOINED. In the old API {{poll(long)}}
> this is okay, since we always try to complete the rebalance successfully within
> the same call, so we would not be in UNJOINED or REBALANCING for very long.
> But with the new {{poll(Duration)}} we may actually return while we are still
> in UNJOINED or REBALANCING, and it may take some time (smaller than
> max.poll.interval.ms but larger than session.timeout.ms) before the next poll
> call. Since heartbeats are disabled during this period, we could be kicked
> out of the group by the coordinator.
> The proposal I have is:
> 1) allow heartbeats to be sent during REBALANCING as well.
> 2) when a join/sync response has a retriable error, do not set the state to
> UNJOINED but stay in REBALANCING.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
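
The two-point proposal in the ticket description can be illustrated with a small state model. This is a minimal sketch under stated assumptions, not the actual Kafka client code: the class `HeartbeatGate` and its methods are hypothetical names introduced here, and only the three state names (UNJOINED, REBALANCING, STABLE) come from the ticket.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified model of the member states named in the ticket.
// Hypothetical illustration only; not the real consumer coordinator.
enum MemberState { UNJOINED, REBALANCING, STABLE }

class HeartbeatGate {
    // Proposal 1: heartbeats are allowed in REBALANCING as well as STABLE.
    private static final Set<MemberState> HEARTBEAT_STATES =
        EnumSet.of(MemberState.REBALANCING, MemberState.STABLE);

    private MemberState state = MemberState.UNJOINED;

    MemberState state() {
        return state;
    }

    boolean shouldHeartbeat() {
        return HEARTBEAT_STATES.contains(state);
    }

    // A join attempt starts a rebalance.
    void onJoinRequested() {
        state = MemberState.REBALANCING;
    }

    // A successful join/sync round completes the rebalance.
    void onRebalanceSucceeded() {
        state = MemberState.STABLE;
    }

    // Proposal 2: a retriable join/sync error leaves the state unchanged
    // (REBALANCING during a rebalance); only a non-retriable error
    // falls back to UNJOINED.
    void onRebalanceFailed(boolean retriable) {
        if (!retriable) {
            state = MemberState.UNJOINED;
        }
    }
}
```

In this model, a member that returns from a poll call while still REBALANCING keeps heartbeating, so a delay before the next poll that exceeds the session timeout would not by itself cause the coordinator to remove it.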