Hi Zach,

If you check controller.log on the broker that is currently the cluster's controller, do you see broker 2 bouncing in and out of ISRs? There'll be log entries to that effect. Or is it just never getting in sync in the first place?
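If it helps to see the current state at a glance, here's a rough sketch that lists the partitions where a given broker is an assigned replica but missing from the ISR. It assumes you feed it the text output of `kafka-topics.sh --describe`, and that the partition lines look like `Topic: t  Partition: 0  Leader: 1  Replicas: 1,2,0  Isr: 1,0`. The function name `out_of_sync` is just something I made up for illustration, and the line format may differ between Kafka versions, so treat it as a starting point.

```python
import re

# Partition lines of `kafka-topics.sh --describe` output, ASSUMED to look like:
#   Topic: t  Partition: 0  Leader: 1  Replicas: 1,2,0  Isr: 1,0
# The per-topic summary line (PartitionCount: ...) does not match and is skipped.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn '1,2,0' into {1, 2, 0}; an empty string gives an empty set."""
    return {int(x) for x in csv.split(",") if x}


def out_of_sync(describe_output, broker_id):
    """Hypothetical helper: return (topic, partition, leader) for every
    partition where broker_id is a replica but is not in the ISR."""
    missing = []
    for line in describe_output.splitlines():
        m = PARTITION_LINE.search(line)
        if not m:
            continue  # not a partition line
        if broker_id in _ids(m["replicas"]) and broker_id not in _ids(m["isr"]):
            missing.append((m["topic"], int(m["partition"]), m["leader"]))
    return missing
```

Running it a few times a minute apart and comparing the results should show whether broker 2 is flapping (the list changes between runs) or simply never catching up (the list stays the same), and the leader column shows whether the problem follows a particular leader.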
Whenever I've had this issue in the past, it's been because the replica fetcher has died. Hate to say this, but have you tried turning broker 2 off and on again? That's usually how I've resolved it when a broker won't stay in the ISR.

Also make sure there's enough CPU/network headroom on the machine it's running on. We've usually had this issue when CPU was very high or the network was saturated.

Cheers,

Liam Clarke-Hutchinson

On Thu, Apr 2, 2020 at 8:51 AM Zach Cox <zcox...@gmail.com> wrote:

> Hi Liam,
>
> > Any issues with partitions broker 2 is leader of?
>
> Earlier today, broker 2 was not leader of any partitions. At that time, 2
> appeared to be in ISRs of all partitions where 1 was leader, but 2 was not
> in any ISRs of partitions where 0 was leader.
>
> Currently, broker 2 is leader of 55 partitions, but does not appear to be
> in ISRs of any other partitions, whether 0 or 1 is leader.
>
> > Also, have you checked b2's server.log?
>
> We don't see any logs that obviously indicate the problem, although we're
> also not sure what things we should be looking for. There are a few
> Zookeeper client timeouts, but haven't correlated that with anything yet.
>
> Thanks,
> Zach