[ https://issues.apache.org/jira/browse/KAFKA-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878804#comment-17878804 ]
Lianet Magrans commented on KAFKA-17170:
----------------------------------------

Hey [~joaopedrofonseca], this was indeed resolved at the end of last week. I just updated it. But you can find many others under the newbie label. It may also help to play with the filters by 'component', so you can see the newbie ones under the components you are more interested in.

Thanks for your interest! Please reach out if you have any questions.

> Add test to ensure new consumer acks reconciled assignment even if first HB with ack lost
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-17170
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17170
>             Project: Kafka
>          Issue Type: Task
>          Components: clients, consumer
>            Reporter: Lianet Magrans
>            Assignee: 黃竣陽
>            Priority: Minor
>              Labels: kip-848-client-support, newbie
>             Fix For: 4.0.0
>
>
> When a consumer reconciles an assignment, it transitions to ACKNOWLEDGING, so
> that a HB is sent on the next manager poll, without waiting for the interval.
> The consumer transitions out of this ack state as soon as it sends the
> heartbeat, without waiting for a response. This is based on the expectation
> that the following heartbeats (sent on the interval) will act as the ack and
> include the set of partitions, even if the first ack is lost.
> This is the expected flow:
> # complete reconciliation and send HB1 to ack assignment tp0
> # HB1 times out (or fails in any way) => heartbeat request manager resets
> the sentFields to null (HeartbeatState.reset(), triggered if the request
> fails, or if it gets a response with an Error)
> # following HB will include tp0 (and act as ack), because it will notice
> that tp0 != null (last value sent)
>
> This seems not to be covered by any test, so we should add a unit test to the
> HeartbeatRequestManager, to ensure that the HB generated in step 3 above
> includes tp0 as I expect :), considering both cases of error: request fails
> (no response) and request gets a response with an Error in it.
>
> This flow is important because, if the member fails to send the reconciled
> partitions in a HB, the broker would keep waiting for an ack that the member
> considers already sent (the broker would wait for the rebalance timeout
> before re-assigning those partitions).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
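
The flow described in the issue can be sketched with a small, self-contained toy model. This is not the Kafka client code: `SketchHeartbeatState` and all of its methods are hypothetical names, assumed here only to illustrate the "last value sent is reset to null on failure, so the next HB includes tp0 again" behaviour that the description attributes to HeartbeatState.reset(). The real unit test against HeartbeatRequestManager will look different.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model of the ack flow in KAFKA-17170. All names are hypothetical;
// none of this is the actual Kafka client API.
class HeartbeatAckSketch {

    static final class SketchHeartbeatState {
        private Set<String> currentAssignment = new TreeSet<>();
        // Last assignment sent in a heartbeat; null means "nothing sent yet".
        private Set<String> sentAssignment = null;

        // Record a newly reconciled assignment.
        void reconcile(Set<String> assignment) {
            currentAssignment = new TreeSet<>(assignment);
        }

        // Partitions to include in the next HB, or null if they are
        // unchanged since the last value sent.
        Set<String> buildRequestPartitions() {
            if (!currentAssignment.equals(sentAssignment)) {
                sentAssignment = new TreeSet<>(currentAssignment);
                return new TreeSet<>(currentAssignment);
            }
            return null;
        }

        // Assumed to be called when the request fails or the response
        // carries an error: forget what was sent.
        void reset() {
            sentAssignment = null;
        }
    }

    // HB1 is lost: the following HB is expected to carry tp0 again.
    static Set<String> partitionsAfterLostHeartbeat() {
        SketchHeartbeatState state = new SketchHeartbeatState();
        state.reconcile(Set.of("tp0"));
        state.buildRequestPartitions(); // step 1: HB1 acks tp0
        state.reset();                  // step 2: HB1 times out or gets an Error
        return state.buildRequestPartitions(); // step 3: following HB
    }

    // HB1 succeeds: the following HB has nothing new to send.
    static Set<String> partitionsAfterSuccessfulHeartbeat() {
        SketchHeartbeatState state = new SketchHeartbeatState();
        state.reconcile(Set.of("tp0"));
        state.buildRequestPartitions(); // HB1 acks tp0, no reset afterwards
        return state.buildRequestPartitions();
    }

    public static void main(String[] args) {
        System.out.println("after lost HB1: " + partitionsAfterLostHeartbeat());
        System.out.println("after successful HB1: " + partitionsAfterSuccessfulHeartbeat());
    }
}
```

In this model, the two error cases from the description (no response, or a response with an Error in it) both reduce to a call to `reset()`, so a test would exercise each path and assert that the next heartbeat includes tp0.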