[jira] [Resolved] (KAFKA-16106) group size counters do not reflect the actual sizes when operations fail

David Jacot (Jira) Fri, 04 Oct 2024 00:32:07 -0700


     [ https://issues.apache.org/jira/browse/KAFKA-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


David Jacot resolved KAFKA-16106.
---------------------------------
    Fix Version/s: 4.0.0
         Assignee: Dongnuo Lyu  (was: Jeff Kim)
       Resolution: Fixed

> group size counters do not reflect the actual sizes when operations fail
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-16106
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16106
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jeff Kim
>            Assignee: Dongnuo Lyu
>            Priority: Major
>             Fix For: 4.0.0
>
>
> An expire-group-metadata operation generates tombstone records, updates the 
> `groups` state, decrements the group size counters, and then writes to the 
> log. If a __consumer_offsets partition reassignment occurs, this operation 
> fails. The `groups` state is reverted to an earlier snapshot, but the 
> classic group size counters are not, so the metrics start to diverge from the 
> actual group sizes. The same applies to every unsuccessful write operation 
> that alters the `groups` state.
>  
> The issue is exacerbated because the expire-group-metadata operation can be 
> retried multiple times until the partition is fully unloaded.
>  
> The solution is to make the counters a timeline data structure 
> (TimelineLong) as well, so that a failed write operation also reverts the 
> counters.
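
The snapshot-and-revert idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SnapshottableCounter` and `GroupCounterDemo` are hypothetical names invented for this example, they are not Kafka's TimelineLong, and the snapshot id merely stands in for whatever position the coordinator would revert to.

```java
import java.util.TreeMap;

// Hypothetical, simplified stand-in for a timeline-backed counter.
// It only illustrates snapshot/revert; it is not Kafka's TimelineLong.
final class SnapshottableCounter {
    private long value;
    // Maps a snapshot id (assumed here to be a log offset) to the value at that point.
    private final TreeMap<Long, Long> snapshots = new TreeMap<>();

    SnapshottableCounter(long initial) {
        this.value = initial;
    }

    long get() {
        return value;
    }

    void increment() {
        value++;
    }

    void decrement() {
        value--;
    }

    // Record the current value under the given snapshot id.
    void snapshot(long id) {
        snapshots.put(id, value);
    }

    // Restore the value recorded at the snapshot and drop any later snapshots.
    void revertTo(long id) {
        Long saved = snapshots.get(id);
        if (saved == null) {
            throw new IllegalArgumentException("No snapshot with id " + id);
        }
        value = saved;
        snapshots.tailMap(id, false).clear();
    }
}

final class GroupCounterDemo {
    public static void main(String[] args) {
        SnapshottableCounter classicGroups = new SnapshottableCounter(3);

        // Take a snapshot before the operation mutates any state.
        classicGroups.snapshot(100L);

        // Expiring a group decrements the counter before the log write happens.
        classicGroups.decrement();

        // Simulate a write that fails, e.g. during a partition reassignment.
        boolean writeSucceeded = false;
        if (!writeSucceeded) {
            // Revert the counter together with the rest of the state.
            classicGroups.revertTo(100L);
        }

        System.out.println(classicGroups.get()); // prints 3
    }
}
```

With a plain counter, each failed retry would decrement the value again without ever restoring it. Reverting the counter to the same snapshot as the `groups` state keeps the two in step.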



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
