Ismael Juma created KAFKA-4479:
----------------------------------

             Summary: Streams tests should pass without hardcoded Time.SYSTEM 
in GroupCoordinator
                 Key: KAFKA-4479
                 URL: https://issues.apache.org/jira/browse/KAFKA-4479
             Project: Kafka
          Issue Type: Improvement
            Reporter: Ismael Juma
            Priority: Minor


If we pass `KafkaServer.time` to `GroupCoordinator`[1] instead of the hardcoded 
`Time.SYSTEM`, some Streams tests like QueryableStateIntegrationTest fail 
semi-regularly. [~damianguy] looked into it and described it as:

{quote}
Looking at the sequence of events, one thread is stopped, and hence leaves the 
group triggering a rebalance, but the other thread doesn’t seem to get the 
memo, tries to commit, fails, and then game-over.

So, in the case where it fails, the one alive thread is not getting a rebalance. 
This would happen during a `poll(..)`, right? However, I can see the thread is 
polling many times after the other thread has shut down.

It tries to commit every time around the loop, so:

poll(..)
process(..)
maybeCommit(..)

and there is like < 10ms between calls to `poll`.
{quote}

One theory was that the mock time was not advancing enough to trigger a 
rebalance in the group coordinator. However, the consumer is closed, so that 
should trigger a `LeaveGroup` request, and it's unclear why a rebalance is not 
triggered for the live consumer.
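
To make the mock-time theory concrete, here is a minimal, self-contained 
sketch. It is not Kafka code: `Clock`, `ManualClock`, and `DelayedOperation` 
are hypothetical names chosen for illustration, and the 3000 ms delay is an 
arbitrary value. It only shows that a deadline computed from an injected clock 
never expires unless the test advances that clock, no matter how many times 
the caller loops. It does not explain why an explicit `LeaveGroup` would fail 
to trigger a rebalance.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MockTimeSketch {
    /** Hypothetical clock abstraction (an assumption, not Kafka's Time interface). */
    interface Clock {
        long milliseconds();
    }

    /** Clock that only moves when the test explicitly advances it. */
    static final class ManualClock implements Clock {
        private final AtomicLong nowMs = new AtomicLong();

        @Override
        public long milliseconds() {
            return nowMs.get();
        }

        /** Advances the mock time without blocking the calling thread. */
        void sleep(long ms) {
            nowMs.addAndGet(ms);
        }
    }

    /** Hypothetical timer-driven step, standing in for any timeout-based check. */
    static final class DelayedOperation {
        private final Clock clock;
        private final long deadlineMs;

        DelayedOperation(Clock clock, long delayMs) {
            this.clock = clock;
            this.deadlineMs = clock.milliseconds() + delayMs;
        }

        boolean isExpired() {
            return clock.milliseconds() >= deadlineMs;
        }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        DelayedOperation op = new DelayedOperation(clock, 3000L);

        // Many fast loop iterations: real time passes, mock time does not.
        for (int i = 0; i < 1000; i++) {
            op.isExpired();
        }
        System.out.println("expired before advancing: " + op.isExpired());

        // Only an explicit advance of the mock clock lets the deadline pass.
        clock.sleep(3000L);
        System.out.println("expired after advancing: " + op.isExpired());
    }
}
```

With a system clock in place of `ManualClock`, the same deadline would pass on 
its own, which is one way a test could behave differently once the clock is 
injected.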

PR where this issue was first seen and discussed: 
https://github.com/apache/kafka/pull/2095

[1] 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222



