[ 
https://issues.apache.org/jira/browse/KAFKA-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301253#comment-17301253
 ] 

Ron Dagostino commented on KAFKA-12455:
---------------------------------------

With a 2-broker cluster that undergoes a series of 5 rolling restarts (which is 
what is happening here), in the Raft case the consumers sometimes receive a 
`MetadataResponse` that lists only a single broker because the other broker is 
restarting.  This never happens in the ZooKeeper case: every 
`MetadataResponse` received there lists both brokers.  I'm not sure why the 
ZooKeeper configuration always reports both brokers, but that is the 
fundamental difference between the two cases in this test scenario.

The problem with the consumer seeing only a single broker in the Raft 
configuration is that when that broker goes down, the consumer suddenly has no 
available brokers that it knows about, and we see messages in the consumer log 
saying `Give up sending metadata request since no node is available`.  It then 
takes a while before the only broker that the consumer knows about restarts.  
By that time the consumer group has already moved to the `GroupCoordinator` on 
the other broker (the one that the consumer didn't know about), and that 
coordinator fails the consumer due to a lack of a heartbeat.  A rebalance 
therefore happens, and this test specifically checks that no rebalances occur 
during the rolling restarts.
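
To make the failure sequence concrete, here is a minimal toy model of it in 
Python.  It is only a sketch under stated assumptions, not Kafka client code: 
the function names, the broker ids, and the timing numbers are all 
hypothetical and chosen purely for illustration.

```python
# Toy model only: every name and number below is hypothetical and is not
# part of the Kafka client or of consumer_test.py.

def usable_nodes(known_brokers, live_brokers):
    """Brokers that the consumer both knows about and can currently reach."""
    return set(known_brokers) & set(live_brokers)


def consumer_is_stranded(known_brokers, live_brokers):
    """True when no broker in the consumer's cached metadata is reachable."""
    return not usable_nodes(known_brokers, live_brokers)


def rebalance_expected(known_brokers, live_brokers, outage_s, session_timeout_s):
    """A stranded consumer cannot heartbeat; if the outage outlasts the
    session timeout, the coordinator on the other broker fails it."""
    stranded = consumer_is_stranded(known_brokers, live_brokers)
    return stranded and outage_s > session_timeout_s


# ZooKeeper-like case: the cached metadata always lists both brokers.
full_view = {1, 2}
# Raft-like case: metadata was fetched while broker 2 was restarting.
partial_view = {1}

# Broker 2 is back up and broker 1 is now the one being restarted.
live = {2}

# Illustrative timings: the restart takes longer than the session timeout.
print(rebalance_expected(full_view, live, outage_s=30, session_timeout_s=10))
print(rebalance_expected(partial_view, live, outage_s=30, session_timeout_s=10))
```

With the full view the consumer can still reach broker 2, so the model 
reports no rebalance; with the partial view it is stranded for longer than the 
assumed session timeout, which corresponds to the unexpected rebalance seen in 
the test.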

> OffsetValidationTest.test_broker_rolling_bounce failing for Raft quorums
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-12455
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12455
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Ron Dagostino
>            Assignee: Ron Dagostino
>            Priority: Blocker
>
> OffsetValidationTest.test_broker_rolling_bounce in `consumer_test.py` is 
> failing because the consumer group is rebalancing unexpectedly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
