[ https://issues.apache.org/jira/browse/KAFKA-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Jacot resolved KAFKA-16106.
---------------------------------
    Fix Version/s: 4.0.0
         Assignee: Dongnuo Lyu  (was: Jeff Kim)
       Resolution: Fixed

> group size counters do not reflect the actual sizes when operations fail
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-16106
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16106
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jeff Kim
>            Assignee: Dongnuo Lyu
>            Priority: Major
>             Fix For: 4.0.0
>
>
> An expire-group-metadata operation generates tombstone records, updates the
> `groups` state and decrements the group size counters, then performs a write
> to the log. If there is a __consumer_offsets partition reassignment, this
> write fails. The `groups` state is reverted to an earlier snapshot, but the
> classic group size counters are not. This introduces an inconsistency between
> the metrics and the actual group sizes. The same applies to every
> unsuccessful write operation that alters the `groups` state.
>
> The issue is exacerbated because the expire-group-metadata operation can be
> retried multiple times until the partition is fully unloaded.
>
> The solution is to make the counters a timeline data structure
> (TimelineLong) as well, so that when a write operation fails the counters
> are reverted along with the `groups` state.
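
The fix described above can be illustrated with a minimal sketch. It assumes a hypothetical `RevertibleCounter` class that stands in for a timeline-backed counter. It is not Kafka's `TimelineLong` and does not reproduce the coordinator's actual code. The epoch value `100L`, the class names, and the simulated write failure are all illustrative assumptions.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a timeline-backed counter; NOT Kafka's TimelineLong.
final class RevertibleCounter {
    private long value;
    // Maps a snapshot epoch to the counter value captured at that epoch.
    private final TreeMap<Long, Long> snapshots = new TreeMap<>();

    long get() { return value; }
    void increment() { value++; }
    void decrement() { value--; }

    // Record the current value under the given epoch (assumed here to be
    // something like the last committed offset).
    void snapshot(long epoch) { snapshots.put(epoch, value); }

    // Restore the value captured at `epoch` and discard any later snapshots.
    void revertTo(long epoch) {
        Long saved = snapshots.get(epoch);
        if (saved == null) {
            throw new IllegalArgumentException("No snapshot for epoch " + epoch);
        }
        value = saved;
        snapshots.tailMap(epoch, false).clear();
    }
}

final class GroupCounterDemo {
    public static void main(String[] args) {
        RevertibleCounter classicGroups = new RevertibleCounter();
        for (int i = 0; i < 3; i++) {
            classicGroups.increment();      // three groups currently exist
        }
        classicGroups.snapshot(100L);       // state as of the last successful write

        classicGroups.decrement();          // expiration removes a group in memory
        boolean writeSucceeded = false;     // simulated failure, e.g. partition reassignment
        if (!writeSucceeded) {
            classicGroups.revertTo(100L);   // counter rolls back with the groups state
        }
        System.out.println(classicGroups.get()); // prints 3
    }
}
```

With a plain `long` counter, the decrement would survive the failed write and each retry would push the metric further from the real group count. Tying the counter to the same snapshot as the `groups` state keeps the two in step.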