It seems that one of the brokers had unusually high CPU utilization: five
of the brokers were at about 15%, and one was at 100%.
After I added more CPUs to the broker that was at 100%, the issue
resolved itself.
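
For anyone hitting the same symptom, here is a rough sketch of how such an
imbalance could be flagged once per-broker CPU figures have been collected.
The function name, the broker labels and the threshold factor are all
hypothetical; only the 15% / 100% split comes from the situation above.

```python
from statistics import median


def find_hot_brokers(cpu_by_broker, factor=3.0):
    """Return the brokers whose CPU utilization exceeds `factor` times the median.

    `cpu_by_broker` maps a broker label to its CPU utilization in percent.
    The factor of 3.0 is an arbitrary illustrative threshold, not a Kafka default.
    """
    baseline = median(cpu_by_broker.values())
    return sorted(b for b, cpu in cpu_by_broker.items() if cpu > factor * baseline)


# Hypothetical labels; the percentages mirror the 5 x 15% / 1 x 100% split.
cpu = {"b1": 15, "b2": 15, "b3": 15, "b4": 15, "b5": 15, "b6": 100}
print(find_hot_brokers(cpu))  # ['b6']
```

Comparing against the median rather than the mean keeps a single saturated
broker from dragging the baseline up.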

Peter

On Thu, 20 Feb 2020 at 10:54, Péter Sinóros-Szabó <
peter.sinoros-sz...@transferwise.com> wrote:

> Hi,
>
> We use Kafka 1.1.1. Recently I ran into an issue (possibly a bug) that I
> can't see how to solve.
> We have a service running as two instances that use the same consumer
> group id to consume some topics. When the service starts and tries to
> join the consumer group, the join does not succeed.
>
> The application gets error messages like:
>
> Accepting Kafka message from topic 'myTopic', partition 0, offset 383554 
> failed.
> Attempt to heartbeat failed since group is rebalancing
>
>
> On the broker, I see:
> ./kafka-consumer-groups.sh  ... --group self-service --describe --state
> COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
> 172.3.xx.yy:9092 (1006)                             PreparingRebalance   1
>
> And it stays stuck there.
>
> In the server logs I see the same entries repeating continuously:
> [2020-02-20 09:49:32,395] INFO [GroupCoordinator 1006]: Stabilized group
> self-service generation 192346 (__consumer_offsets-32)
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:32,396] INFO [GroupCoordinator 1006]: Assignment
> received from leader for group self-service for generation 192346
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:32,406] INFO [GroupCoordinator 1006]: Preparing to
> rebalance group self-service with old generation 192346
> (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:32,406] INFO [GroupCoordinator 1006]: Group self-service
> with generation 192347 is now empty (__consumer_offsets-32)
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:33,722] INFO [GroupCoordinator 1006]: Preparing to
> rebalance group self-service with old generation 192347
> (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:36,723] INFO [GroupCoordinator 1006]: Stabilized group
> self-service generation 192348 (__consumer_offsets-32)
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:36,724] INFO [GroupCoordinator 1006]: Assignment
> received from leader for group self-service for generation 192348
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:36,734] INFO [GroupCoordinator 1006]: Preparing to
> rebalance group self-service with old generation 192348
> (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:36,734] INFO [GroupCoordinator 1006]: Group self-service
> with generation 192349 is now empty (__consumer_offsets-32)
> (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:37,419] INFO [GroupCoordinator 1006]: Preparing to
> rebalance group self-service with old generation 192349
> (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
> [2020-02-20 09:49:40,419] INFO [GroupCoordinator 1006]: Stabilized group
> self-service generation 192350 (__consumer_offsets-32)
> (kafka.coordinator.group.GroupCoordinator)
>
> What should I do to fix it? I tried restarting all brokers and the service
> several times, but it always ends up in this state.
> The same setup works fine in another environment.
>
> Thanks,
> Peter
>
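
The rebalance loop in the quoted coordinator log can be quantified with a
small script. This is only a sketch: it assumes the wrapped log lines have
been joined back into one line per entry, and the function and variable
names are made up for illustration. It measures the time between
consecutive "Stabilized group" entries for one group.

```python
import re
from datetime import datetime

# Matches entries such as:
# [2020-02-20 09:49:32,395] INFO [GroupCoordinator 1006]: Stabilized group self-service generation 192346 ...
STABILIZED = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:,]+)\] INFO \[GroupCoordinator \d+\]: "
    r"Stabilized group (?P<group>\S+) generation (?P<gen>\d+)"
)


def stabilization_intervals(lines, group):
    """Return (generation, seconds since the previous stabilization) pairs."""
    previous = None
    result = []
    for line in lines:
        m = STABILIZED.search(line)
        if not m or m.group("group") != group:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None:
            seconds = round((ts - previous).total_seconds(), 3)
            result.append((int(m.group("gen")), seconds))
        previous = ts
    return result


# The three "Stabilized" entries from the quoted log, unwrapped.
suffix = " (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)"
prefix = " INFO [GroupCoordinator 1006]: Stabilized group self-service generation "
sample = [
    "[2020-02-20 09:49:32,395]" + prefix + "192346" + suffix,
    "[2020-02-20 09:49:36,723]" + prefix + "192348" + suffix,
    "[2020-02-20 09:49:40,419]" + prefix + "192350" + suffix,
]
print(stabilization_intervals(sample, "self-service"))
# [(192348, 4.328), (192350, 3.696)]
```

In the quoted log the group re-stabilizes roughly every four seconds and the
generation advances by two each time, which matches the group becoming empty
and re-forming in between.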
