Hello,

We have a Kafka cluster of 20 brokers (v0.8.2.1), and we are repeatedly
running into trouble in a maintenance scenario. Each broker node uses 2 HDs
to store the logs (the topic replication factor is 3 throughout).

The typical maintenance scenario is that one of the disks on a node fails, so
we stop the broker to get the disk replaced. After the HD replacement, half of
the broker's former data is therefore missing. When the node comes online
again, it streams the missing partition data (i.e. mainly that of the
replaced, fresh disk) for some hours.

Our issue is that during this recovery time, we consistently run into
instabilities on the side of our consumers (high-level consumer,
Kafka-committed offsets). The consumer groups quite often have to
rebalance their partition assignment during this time, which eventually
causes consumption to hang.

If the consumer lag gets too big and we stop the recovering broker again
for some time, or if the recovery of that broker has finally finished,
everything stabilizes again.

Is there a known problem in this respect, or better yet a recommendation
on how to deal with it? It sounds somewhat like the problem mentioned in
https://issues.apache.org/jira/browse/KAFKA-1464.

Our impression is that once the recovering node becomes leader for some of
its partitions while recovery is still running, it isn't yet able to serve
those partitions properly, e.g. due to network saturation. Hence, the broker
seems to periodically gain and lose leadership for those partitions, which
might explain the instabilities / rebalancing of the consumer groups.

The output in our state-change log files seems to confirm this, i.e. we
do see quite a bit of leadership swapping there, specifically for the
partitions for which the recovering broker would normally be the leader.
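
To illustrate what we mean by leadership swapping, here is a minimal
sketch of how such swaps could be counted per partition. It assumes the
leader changes have already been extracted from the state-change logs
into (partition, leader broker id) pairs in chronological order; the
function name and the sample events are hypothetical and not taken from
our actual logs.

```python
from collections import defaultdict


def count_leader_changes(events):
    """Count how often the leader changed for each partition.

    events: iterable of (partition, leader_broker_id) pairs in
    chronological order (hypothetical, pre-parsed input).
    """
    last_leader = {}
    changes = defaultdict(int)
    for partition, leader in events:
        previous = last_leader.get(partition)
        # Only count a change if we have seen a different leader before.
        if previous is not None and previous != leader:
            changes[partition] += 1
        last_leader[partition] = leader
    return dict(changes)


# Hypothetical sample: broker 7 is the recovering node.
events = [
    ("topicA-0", 7),
    ("topicA-0", 3),
    ("topicA-0", 7),
    ("topicA-1", 4),
    ("topicA-0", 3),
]
flaps = count_leader_changes(events)
print(flaps)
```

Partitions with a high count over a short time window would be the ones
whose leadership keeps moving to and from the recovering broker.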

Any advice in this matter would be much appreciated.

For example, if there were a way to prevent the recovering node from
acquiring leadership for any partitions, I suppose this could solve our
problems if we activated something like that (manually) during recovery
time.
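
One idea we have considered, purely as an untested assumption on our
side, is to move the recovering broker to the end of the replica list
for the partitions where it is listed first, so that it is no longer the
preferred replica. The sketch below only builds a plan in the JSON
layout used for partition reassignment files ("version" plus a
"partitions" list with "topic", "partition" and "replicas"); the
function name and the sample assignment are hypothetical, and we have
not verified that applying such a plan on 0.8.2.1 actually prevents the
leadership changes.

```python
import json


def demote_broker(assignment, broker_id):
    """Build a plan that moves broker_id from first to last replica.

    assignment: list of dicts with "topic", "partition" and "replicas"
    (hypothetical input describing the current replica order).
    Only partitions where broker_id is currently listed first are included.
    """
    partitions = []
    for entry in assignment:
        replicas = entry["replicas"]
        if len(replicas) > 1 and replicas[0] == broker_id:
            reordered = [r for r in replicas if r != broker_id] + [broker_id]
            partitions.append({
                "topic": entry["topic"],
                "partition": entry["partition"],
                "replicas": reordered,
            })
    return {"version": 1, "partitions": partitions}


# Hypothetical current assignment; broker 7 is the recovering node.
current = [
    {"topic": "events", "partition": 0, "replicas": [7, 3, 12]},
    {"topic": "events", "partition": 1, "replicas": [3, 7, 12]},
]
plan = demote_broker(current, 7)
print(json.dumps(plan, indent=2))
```

The replica set of each partition stays the same here; only the order
changes, so no additional data would have to be copied.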

Thanks in advance,
Ralph Weires
