kirktrue opened a new pull request, #15723:
URL: https://github.com/apache/kafka/pull/15723

   In some cases, the network layer is _very_ fast and can send out multiple 
requests within the same millisecond timestamp.
   
   The previous logic for tracking inflight status used timestamps: if the 
timestamp of the last received response was less than the timestamp of the 
last sent request, we'd interpret that as having an inflight request. However, 
this approach would incorrectly return `false` from 
`RequestState.requestInFlight()` if the two timestamps were _equal_, that is, 
when a new request was sent in the same millisecond in which the previous 
response was received.
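
   As a rough illustration of the failure mode, here is a minimal sketch of 
timestamp-based tracking. The class name `TimestampRequestState`, its fields, 
and the simplified method signatures are assumptions made for this example; 
this is not the actual Kafka source.

```java
// Hypothetical, simplified model of timestamp-based in-flight tracking.
// Names and signatures are illustrative assumptions, not Kafka's real class.
class TimestampRequestState {
    private long lastSentMs = -1;
    private long lastReceivedMs = -1;

    // Record the time at which a request was handed to the network layer.
    void onSendAttempt(long currentTimeMs) {
        lastSentMs = currentTimeMs;
    }

    // Record the time at which a response was received.
    void onSuccessfulAttempt(long currentTimeMs) {
        lastReceivedMs = currentTimeMs;
    }

    // A request is considered in flight only if the last response is
    // strictly older than the last send. If both happen within the same
    // millisecond, the timestamps are equal and this returns false even
    // though a request has just been sent.
    boolean requestInFlight() {
        return lastSentMs > -1 && lastReceivedMs < lastSentMs;
    }
}
```

   With this sketch, receiving a response at `t=100` and sending the next 
request at `t=100` leaves `requestInFlight()` returning `false`, so a caller 
would feel free to send yet another request.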
   
   One result of this faulty logic is that in such cases, the consumer would 
accidentally send multiple heartbeat requests to the consumer group 
coordinator. The consumer group coordinator would interpret these requests as 
'join group' requests and create a member for each request. The coordinator 
therefore believed that there were more members in the group than there 
really were. Consequently, if your luck was _really_ bad, the coordinator 
might assign partitions to one of the duplicate members. Those partitions 
would then belong to a phantom consumer that was not reading any data, and 
this led to flaky tests.
   
   The new implementation in `RequestState` uses a simple flag that is set in 
`onSendAttempt` and cleared in `onSuccessfulAttempt`, `onFailedAttempt`, and 
`reset`. A new unit test has been added. The change has been tested against 
all of the consumer unit and integration tests, and it has removed all known 
occurrences of phantom consumer group members in the system tests.
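
   The flag-based approach can be sketched roughly as follows. The class name 
`FlagRequestState` and the no-argument method signatures are assumptions made 
for brevity; only the method names come from the description above, and the 
real `RequestState` carries additional state that is omitted here.

```java
// Hypothetical, simplified model of flag-based in-flight tracking.
// Only the method names mirror the PR description; everything else is an
// illustrative assumption.
class FlagRequestState {
    private boolean requestInFlight = false;

    // Set the flag as soon as a request is sent.
    void onSendAttempt() {
        requestInFlight = true;
    }

    // Clear the flag when the request completes successfully.
    void onSuccessfulAttempt() {
        requestInFlight = false;
    }

    // Clear the flag when the request fails.
    void onFailedAttempt() {
        requestInFlight = false;
    }

    // Clear the flag when the state is reset.
    void reset() {
        requestInFlight = false;
    }

    // The answer no longer depends on clock resolution.
    boolean requestInFlight() {
        return requestInFlight;
    }
}
```

   Because the result depends only on the order of calls, a send that follows 
a response within the same millisecond is still reported as in flight.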
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

