[ https://issues.apache.org/jira/browse/KAFKA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justine Olshan resolved KAFKA-16012.
------------------------------------
    Resolution: Fixed

> Incomplete range assignment in consumer
> ---------------------------------------
>
>                 Key: KAFKA-16012
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16012
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Philip Nee
>            Priority: Blocker
>             Fix For: 3.7.0
>
>
> We were looking into test failures here:
> https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1702475525--jolshan--kafka-15784--7cad567675/2023-12-13--001./2023-12-13–001./report.html.
>
> Here is the first failure in the report:
> {code:java}
> ====================================================================================================
> test_id: kafkatest.tests.core.group_mode_transactions_test.GroupModeTransactionsTest.test_transactions.failure_mode=clean_bounce.bounce_target=brokers
> status: FAIL
> run time: 3 minutes 4.950 seconds
> TimeoutError('Consumer consumed only 88223 out of 100000 messages in 90s')
> {code}
>
> We traced the failure to an apparent bug during the last rebalance before the
> group became empty. The last remaining instance seems to receive an
> incomplete assignment which prevents it from completing expected consumption
> on some partitions. Here is the rebalance from the coordinator's perspective:
> {code:java}
> server.log.2023-12-13-04:[2023-12-13 04:58:56,987] INFO [GroupCoordinator 3]: Stabilized group grouped-transactions-test-consumer-group generation 5 (__consumer_offsets-2) with 1 members (kafka.coordinator.group.GroupCoordinator)
> server.log.2023-12-13-04:[2023-12-13 04:58:56,990] INFO [GroupCoordinator 3]: Assignment received from leader consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd for group grouped-transactions-test-consumer-group for generation 5. The group has 1 members, 0 of which are static. (kafka.coordinator.group.GroupCoordinator)
> {code}
> The group is down to one member in generation 5. In the previous generation,
> the consumer in question reported this assignment:
> {code:java}
> // Gen 4: we've got partitions 0-4
> [2023-12-13 04:58:52,631] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete with generation 4 and memberId consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2023-12-13 04:58:52,631] INFO [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Notifying assignor about the new Assignment(partitions=[input-topic-0, input-topic-1, input-topic-2, input-topic-3, input-topic-4]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> {code}
> However, in generation 5, we seem to be assigned only one partition:
> {code:java}
> // Gen 5: Now we have only partition 1? But aren't we the last member in the group?
> [2023-12-13 04:58:56,954] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete with generation 5 and memberId consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2023-12-13 04:58:56,955] INFO [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Notifying assignor about the new Assignment(partitions=[input-topic-1]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> {code}
> The assignment type is range from the JoinGroup for generation 5.
> The decoded metadata sent by the consumer is this:
> {code:java}
> Subscription(topics=[input-topic], ownedPartitions=[], groupInstanceId=null, generationId=4, rackId=null)
> {code}
> Here is the decoded assignment from the SyncGroup:
> {code:java}
> Assignment(partitions=[input-topic-1])
> {code}
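
For reference, the report's expectation that the last remaining member should own every partition can be illustrated with a simplified range-style assignment for a single topic. This is only a sketch and not Kafka's actual RangeAssignor code: the class name RangeSketch, the member id "member-a", and the partition count of 5 are hypothetical (the report does not state how many partitions input-topic has), and rack awareness and multiple topics are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RangeSketch {
    // Simplified range-style assignment for ONE topic (illustrative only).
    // Members are sorted by id; each gets a contiguous block of partitions,
    // and the first (numPartitions % members) members get one extra partition.
    static Map<String, List<Integer>> assign(List<String> members, int numPartitions) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("group has no members");
        }
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        int perMember = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        Map<String, List<Integer>> result = new TreeMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            int start = perMember * i + Math.min(i, extra);
            int length = perMember + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int p = start; p < start + length; p++) {
                partitions.add(p);
            }
            result.put(sorted.get(i), partitions);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical: one member, five partitions (count assumed for illustration).
        System.out.println(assign(List.of("member-a"), 5));
    }
}
```

Under these assumptions a group with a single member receives every partition of the topic, which is why an assignment containing only input-topic-1 for the sole member of generation 5 looks incomplete.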