[ https://issues.apache.org/jira/browse/KAFKA-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693024#comment-17693024 ]
Siddharth Anand commented on KAFKA-14757:
-----------------------------------------

[~ocadaruma] We did not see the rebalance-in-progress exception in our consumer logs. If we were able to safely commit offsets during a rebalance, that might resolve the duplicate consumption issue. It seems to me that the Cooperative Sticky Assignor has no safe point during rebalancing, unlike the stop-the-world Range Assignor. Hence, during high traffic ramps, while consumers are busy scaling out and incremental rebalancing takes place, there is a lot of duplicate message consumption (e.g. >70%, as witnessed in our production environment).

> Kafka Cooperative Sticky Assignor results in significant duplicate consumption
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-14757
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14757
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 3.1.1
>         Environment: AWS MSK (broker) and Spring Kafka (2.8.7) for use in
> Spring Boot consumers.
>            Reporter: Siddharth Anand
>            Priority: Critical
>
> Details may be found within the linked document:
> [Kafka Cooperative Sticky Assignor Issue : Duplicate Consumption |
> [https://docs.google.com/document/d/1E7qAwGOpF8jo_YhF4NwUx9CXxUGJmT8OhHEqIg7-GfI/edit?usp=sharing]]
> In a nutshell, we noticed that the Cooperative Sticky Assignor resulted in
> significant duplicate message consumption. During last year's F1 Grand Prix
> events and World Cup soccer events, our company's Kafka-based platform
> received live traffic. This live traffic, coupled with autoscaled consumers,
> resulted in as much as 70% duplicate message consumption at the Kafka
> consumers.
> In December 2022, we ran a synthetic load test to confirm that duplicate
> message consumption occurs during consumer scale out/in and Kafka partition
> rebalancing when using the Cooperative Sticky Assignor. This issue does not
> occur when using the Range Assignor.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)