lucasbru opened a new pull request, #19818: URL: https://github.com/apache/kafka/pull/19818
There is a sequence of interactions with the membership managers of KIP-848, KIP-932, and KIP-1071 that can put the membership manager into the JOINING state while the member epoch is set to -1. This can result in an invalid request being sent, since joining heartbeats must not carry member epoch -1, and it may cause the member to fail to join. In the case of streams, the group coordinator will return INVALID_REQUEST.

This is the sequence triggering the bug. It seems relatively likely, and is caused by two heartbeat responses being received after the next heartbeat has been sent:

- `membershipManager.leaveGroup();` -> transitions to LEAVING
- `membershipManager.onHeartbeatRequestGenerated();` -> transitions to UNSUBSCRIBED
- `membershipManager.onHeartbeatSuccess(... with member epoch > 0);` -> unblocks the consumer
- `membershipManager.onSubscriptionUpdated();`
- `membershipManager.onConsumerPoll();` -> transitions to JOINING
- `membershipManager.onHeartbeatSuccess(... with member epoch < 0);` -> updates the epoch to a negative value

Now we are in state JOINING with memberEpoch -1, and the next heartbeat we send will be malformed, triggering INVALID_REQUEST.

The bug may also be triggered if the `unsubscribe` call times out, but this seems to be more of a corner case.

To prevent the bug, we are taking two measures:

- The likely path to triggering the bug is prevented by not unblocking an `unsubscribe` call in the consumer when a response with a non-leave member epoch is received. Once we have sent out the leave group heartbeat, we ignore all heartbeat responses except those containing memberEpoch < 0.
- For extra measure, we also prevent the second case (`unsubscribe` timing out). Here the consumer gets unblocked before we have received the leave group heartbeat response, and may resubscribe to the group. When the heartbeat response containing a member epoch < 0 then arrives after we have already left the `UNSUBSCRIBED` state, we simply ignore it.
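The race and the two guards can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `MembershipSketch`, its `guarded` flag, `resubscribe()`, and the `unsubscribeUnblocked` field are hypothetical names invented for illustration and are not the classes or fields changed in this PR. It replays the sequence above with and without the guards.

```java
/** Hypothetical, simplified model of a membership manager; not Kafka's actual class. */
final class MembershipSketch {
    enum State { STABLE, LEAVING, UNSUBSCRIBED, JOINING }

    private final boolean guarded;         // true = apply the two measures described above
    private State state = State.STABLE;
    private int memberEpoch = 5;           // arbitrary positive epoch of an active member
    private boolean unsubscribeUnblocked;  // stands in for completing the consumer's unsubscribe()

    MembershipSketch(boolean guarded) { this.guarded = guarded; }

    State state() { return state; }
    int memberEpoch() { return memberEpoch; }

    void leaveGroup() { state = State.LEAVING; }

    void onHeartbeatRequestGenerated() {
        // The leave heartbeat has been generated, so the member counts as unsubscribed.
        if (state == State.LEAVING) state = State.UNSUBSCRIBED;
    }

    /** Models onSubscriptionUpdated() followed by onConsumerPoll(). */
    void resubscribe() {
        state = State.JOINING;
        memberEpoch = 0; // a joining heartbeat carries epoch 0
    }

    void onHeartbeatSuccess(int epoch) {
        if (guarded) {
            // Measure 1: after the leave heartbeat, ignore responses with a non-leave epoch.
            if (state == State.UNSUBSCRIBED && epoch >= 0) return;
            // Measure 2: ignore a late leave response once we have left UNSUBSCRIBED.
            if (state != State.UNSUBSCRIBED && epoch < 0) return;
        }
        memberEpoch = epoch;
        if (state == State.UNSUBSCRIBED) unsubscribeUnblocked = true;
    }

    /** Likely path: a stale response arrives before the leave response. */
    static MembershipSketch replay(boolean guarded) {
        MembershipSketch m = new MembershipSketch(guarded);
        m.leaveGroup();
        m.onHeartbeatRequestGenerated();
        m.onHeartbeatSuccess(5);                      // response to an earlier heartbeat
        if (m.unsubscribeUnblocked) m.resubscribe();  // consumer resubscribes once unblocked
        m.onHeartbeatSuccess(-1);                     // response to the leave heartbeat
        if (m.state != State.JOINING && m.unsubscribeUnblocked) m.resubscribe();
        return m;
    }

    /** Corner case: unsubscribe times out and the consumer resubscribes early. */
    static MembershipSketch replayTimeout(boolean guarded) {
        MembershipSketch m = new MembershipSketch(guarded);
        m.leaveGroup();
        m.onHeartbeatRequestGenerated();
        m.resubscribe();           // unblocked by the timeout, not by a response
        m.onHeartbeatSuccess(-1);  // late response to the leave heartbeat
        return m;
    }

    public static void main(String[] args) {
        MembershipSketch buggy = replay(false);
        MembershipSketch fixed = replay(true);
        System.out.println("unguarded: " + buggy.state() + " epoch=" + buggy.memberEpoch());
        System.out.println("guarded:   " + fixed.state() + " epoch=" + fixed.memberEpoch());
    }
}
```

In this model the unguarded replay ends in JOINING with epoch -1, which is the malformed combination described above, while the guarded replay ends in JOINING with epoch 0 in both the likely path and the timeout case.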