smjn opened a new pull request, #19781: URL: https://github.com/apache/kafka/pull/19781
1. Currently, the code allows any initializing topic to be retried in subsequent heartbeats. This can result in duplicate calls to the persister if multiple share consumers join the same group concurrently. Furthermore, only one of these calls will succeed; the others will carry a lower state epoch and will be fenced.
2. The existing change was made in https://github.com/apache/kafka/pull/19603 to allow retrying the initialization of initializing topics in case the original caller was unable to persist the information in the persister due to a dead broker or a timeout.
3. To prevent multiple calls while still allowing retries, the `TimelineHashMap` holding the `ShareGroupStatePartitionMetadataInfo` now also holds the timestamp at which the record was replayed.
   a. When multiple consumers arrive for the same group and topic, only one of them is allowed to make the persister initialize request, and this information is added to the map when the record is replayed. This solves issue 1.
   b. To allow for retries, an initializing topic whose timestamp is older than 2*offset_write_commit_ms may be retried. Here too, only one consumer is able to retry, which resolves issue 2 as well.
4. Tests have been added wherever applicable and existing ones updated.
5. No record schema changes are involved.
6. The `ShareGroupStatePartitionMetadataInfo` and `InitMapValue` records have been moved to the `ShareGroup` class for better encapsulation.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
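
As a rough illustration of the gating described in point 3 of the PR description, the following is a minimal sketch and not the code in the PR. The class `InitializingTopicTracker`, its method `tryClaim`, and the plain `HashMap` are hypothetical stand-ins; the PR itself stores the timestamp alongside `ShareGroupStatePartitionMetadataInfo` in a timeline map and records it when the record is replayed, whereas this sketch collapses "claim" and "replay" into one step for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka: remembers when each topic entered
// the "initializing" state so that only one caller issues the initialize request.
final class InitializingTopicTracker {
    // Topic name -> timestamp (ms) at which the topic was marked as initializing.
    private final Map<String, Long> initializingSince = new HashMap<>();
    private final long retryAfterMs;

    InitializingTopicTracker(long offsetWriteCommitMs) {
        // Retry window taken from the description: 2 * offset_write_commit_ms.
        this.retryAfterMs = 2 * offsetWriteCommitMs;
    }

    /**
     * Returns true if the caller may send the initialize request for the topic.
     * That is the case when the topic is not yet marked as initializing, or when
     * its timestamp is older than the retry window. In both cases the timestamp
     * is (re)set, so concurrent or later callers inside the window get false.
     */
    boolean tryClaim(String topic, long nowMs) {
        Long since = initializingSince.get(topic);
        if (since != null && nowMs - since <= retryAfterMs) {
            return false; // another caller claimed this topic recently
        }
        initializingSince.put(topic, nowMs);
        return true;
    }
}
```

With a window of 2 * 5000 ms, the first caller at time 0 is allowed through, a second caller at time 1 is rejected (issue 1), and a caller arriving after more than 10000 ms is allowed to retry exactly once (issue 2).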