[jira] [Commented] (KAFKA-12455) OffsetValidationTest.test_broker_rolling_bounce failing for Raft quorums

Ron Dagostino (Jira) Mon, 15 Mar 2021 13:24:04 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301952#comment-17301952
 ]


Ron Dagostino commented on KAFKA-12455:
---------------------------------------

I looked at the brokers' metadata caches for the two separate configurations -- 
ZK vs. Raft -- to find out what percentage of the time they showed 1 alive 
broker instead of 2.  I was expecting the ZooKeeper configuration to show 
relatively little time with just 1 alive broker since the clients never see 
that metadata situation, and I was expecting the Raft configuration to show a 
much higher percentage of time with just 1 alive broker since the clients do 
see that metadata situation.  I did not find what I was expecting to find.

The amounts of time during which the brokers advertise just 1 alive broker in 
their metadata cache are as follows:

*ZooKeeper Configuration*:
    BrokerId=1: 37 seconds out of 61 seconds of that broker's availability
        during the test, or 61% of the time with just 1 alive broker in
        metadata cache
    BrokerId=2: 39 seconds out of 61 seconds of that broker's availability
        during the test, or 64% of the time with just 1 alive broker in
        metadata cache

*Raft Configuration*:
    BrokerId=1: 37 seconds out of 88 seconds of that broker's availability
        during the test, or 42% of the time with just 1 alive broker in
        metadata cache
    BrokerId=2: 52 seconds out of 88 seconds of that broker's availability
        during the test, or 59% of the time with just 1 alive broker in
        metadata cache
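
For reference, the percentages above can be reproduced from the raw second 
counts with a few lines of Python.  This is only a sketch of the arithmetic: 
the numbers are the ones quoted in this comment, while the names 
`observations` and `single_alive_pct` are made up for illustration and are not 
part of the Kafka system tests.

```python
# (config, broker id) -> (seconds advertising 1 alive broker,
#                         seconds the broker was available during the test)
# Values copied from the figures quoted in this comment.
observations = {
    ("ZooKeeper", 1): (37, 61),
    ("ZooKeeper", 2): (39, 61),
    ("Raft", 1): (37, 88),
    ("Raft", 2): (52, 88),
}


def single_alive_pct(single_alive_secs, available_secs):
    """Percentage of available time spent advertising 1 alive broker."""
    return round(100 * single_alive_secs / available_secs)


for (config, broker_id), (single, total) in observations.items():
    pct = single_alive_pct(single, total)
    print(f"{config} BrokerId={broker_id}: {single}/{total} s = {pct}%")
```

Both ZooKeeper brokers come out at or above 61%, while the Raft brokers come 
out at 42% and 59%.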

So the brokers in the ZooKeeper configuration consider just 1 broker to be 
alive for a larger share of the time than the brokers in the Raft 
configuration do!

It is still not clear why the consumers never see just a single alive broker in 
the ZooKeeper configuration.  From the above it does not appear to be due to 
any difference in metadata cache population -- if it were just that then we 
would see the test failing in the ZooKeeper configuration since that actually 
advertises a single alive broker more frequently in terms of percentage of test 
time.



> OffsetValidationTest.test_broker_rolling_bounce failing for Raft quorums
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-12455
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12455
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Ron Dagostino
>            Assignee: Ron Dagostino
>            Priority: Blocker
>
> OffsetValidationTest.test_broker_rolling_bounce in `consumer_test.py` is 
> failing because the consumer group is rebalancing unexpectedly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
