I’ve had the same experience as Liam with this symptom (all followers on a
single broker of a given leader getting stuck). It sounds likely that
either the replica fetcher thread is getting stuck or dying with an
unhandled exception.
In the former case, jstack output can be helpful to understand
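
For anyone wanting to check the fetcher threads in a jstack dump of the affected broker, here is a minimal Python sketch. It assumes HotSpot-style jstack output and that fetcher thread names end in `ReplicaFetcherThread-<fetcherId>-<sourceBrokerId>`; verify that naming against your Kafka version. The function names and the sample dump are made up purely for illustration and are not taken from a real broker.

```python
import re

# Header line of a thread in a jstack dump starts with the quoted thread name.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# The state line follows the header, e.g. "java.lang.Thread.State: RUNNABLE".
THREAD_STATE = re.compile(r'^\s*java\.lang\.Thread\.State:\s*(?P<state>\w+)')
# Assumed naming convention: ...ReplicaFetcherThread-<fetcherId>-<sourceBrokerId>
FETCHER_NAME = re.compile(r'ReplicaFetcherThread-(\d+)-(\d+)$')


def replica_fetcher_states(jstack_text):
    """Map each replica fetcher thread name to its reported thread state."""
    states = {}
    current = None
    for line in jstack_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            name = header.group("name")
            current = name if FETCHER_NAME.search(name) else None
            if current:
                states[current] = "UNKNOWN"
            continue
        state = THREAD_STATE.match(line)
        if state and current:
            states[current] = state.group("state")
    return states


def missing_fetchers(states, source_broker_ids):
    """Return source broker ids that have no fetcher thread in the dump."""
    seen = set()
    for name in states:
        match = FETCHER_NAME.search(name)
        if match:
            seen.add(int(match.group(2)))
    return sorted(set(source_broker_ids) - seen)


# Synthetic dump for illustration only (not real output from any broker).
SAMPLE_DUMP = (
    '"ReplicaFetcherThread-0-1" #75 prio=5 os_prio=0 waiting on condition\n'
    '   java.lang.Thread.State: TIMED_WAITING (parking)\n'
    '    at sun.misc.Unsafe.park(Native Method)\n'
    '\n'
    '"kafka-request-handler-0" #50 daemon prio=5 os_prio=0 waiting on condition\n'
    '   java.lang.Thread.State: WAITING (parking)\n'
)

if __name__ == "__main__":
    states = replica_fetcher_states(SAMPLE_DUMP)
    print(states)
    # On a follower that should fetch from brokers 0 and 1:
    print(missing_fetchers(states, [0, 1]))
```

A source broker reported by `missing_fetchers` would suggest the fetcher thread for that leader has died, while a thread that is present but sits in the same stack across several dumps would point more towards it being stuck.
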
Hi Zach,
If you check the cluster's controller's controller.log, do you see broker
2 bouncing in and out of ISRs? There'll be logs to that effect. Or is it
just never getting in-sync in the first place?
Whenever I've had this issue in the past, it's been because the replica
fetcher has died. Hat
Hi Liam,
> Any issues with partitions broker 2 is leader of?
>
Earlier today, broker 2 was not leader of any partitions. At that time, 2
appeared to be in ISRs of all partitions where 1 was leader, but 2 was not
in any ISRs of partitions where 0 was leader.
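
A breakdown like the one above (which leaders' ISRs a broker is missing from) can be produced from the output of `kafka-topics.sh --describe`. The following is a minimal Python sketch, assuming the per-partition line layout `Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ...`; the function name and the sample text are hypothetical and only there for illustration.

```python
import re

# Matches one per-partition line of `kafka-topics.sh --describe` output.
# The topic summary line (PartitionCount, ReplicationFactor) does not match.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def out_of_sync_by_leader(describe_text, broker_id):
    """Count, per leader, partitions where broker_id is a replica but not in the ISR."""
    counts = {}
    for line in describe_text.splitlines():
        match = PARTITION_LINE.search(line)
        if not match:
            continue
        replicas = {int(x) for x in match.group("replicas").split(",") if x}
        isr = {int(x) for x in match.group("isr").split(",") if x}
        if broker_id in replicas and broker_id not in isr:
            leader = int(match.group("leader"))
            counts[leader] = counts.get(leader, 0) + 1
    return counts


# Synthetic describe output for illustration only (not from a real cluster).
SAMPLE_DESCRIBE = (
    "Topic:events\tPartitionCount:3\tReplicationFactor:3\tConfigs:\n"
    "\tTopic: events\tPartition: 0\tLeader: 0\tReplicas: 0,1,2\tIsr: 0,1\n"
    "\tTopic: events\tPartition: 1\tLeader: 1\tReplicas: 1,2,0\tIsr: 1,2,0\n"
    "\tTopic: events\tPartition: 2\tLeader: 0\tReplicas: 0,2,1\tIsr: 0,1\n"
)

if __name__ == "__main__":
    # Leaders whose partitions are missing broker 2 from the ISR, with counts.
    print(out_of_sync_by_leader(SAMPLE_DESCRIBE, 2))
```

If every out-of-sync partition maps to a single leader, that narrows the problem to the fetcher pulling from that one broker rather than to broker 2 as a whole.
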
Currently, broker 2 is leader of 55 p
Hi Zach,
Any issues with partitions broker 2 is leader of?
Also, have you checked b2's server.log?
Cheers,
Liam Clarke-Hutchinson
On Wed, 1 Apr. 2020, 11:02 am Zach Cox wrote:
> Hi - We have a small Kafka 2.0.0 (Zookeeper 3.4.13) cluster with 3 brokers:
> 0, 1, and 2. Each broker is in a se
Hi - We have a small Kafka 2.0.0 (Zookeeper 3.4.13) cluster with 3 brokers:
0, 1, and 2. Each broker is in a separate rack (Azure zone).
Recently there was an incident, where Kafka brokers and Zookeeper nodes
restarted, etc. After that occurred, we've had problems where broker 2 is
consistently ou