If only one broker isn't in sync, in my experience it can be caused by a dead replica fetcher thread. I fixed it by restarting the affected broker, but this was on 0.11, so YMMV.
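To follow up on the offline-directory suggestion quoted below, here is a minimal Python sketch for picking failed log directories out of the JSON that `kafka-log-dirs.sh --describe` prints. It assumes the output has the shape `{"brokers": [{"broker": ..., "logDirs": [{"logDir": ..., "error": ...}]}]}` with `error` set to null for a healthy directory; please check that against the output of your own Kafka version. The function name, broker ids, and paths in the sample are made up for illustration.

```python
import json


def offline_log_dirs(describe_json):
    """Return (broker_id, log_dir, error) for each log dir that reports an error.

    Assumes the JSON layout described above; this is not an official Kafka API.
    """
    report = json.loads(describe_json)
    problems = []
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            error = log_dir.get("error")
            if error is not None:
                problems.append((broker.get("broker"), log_dir.get("logDir"), error))
    return problems


# Hypothetical sample: broker 3 has one failed directory.
SAMPLE = json.dumps({
    "version": 1,
    "brokers": [
        {"broker": 1, "logDirs": [
            {"logDir": "/data/kafka-a", "error": None, "partitions": []},
        ]},
        {"broker": 3, "logDirs": [
            {"logDir": "/data/kafka-b",
             "error": "org.apache.kafka.common.errors.KafkaStorageException",
             "partitions": []},
        ]},
    ],
})

if __name__ == "__main__":
    for broker_id, log_dir, error in offline_log_dirs(SAMPLE):
        print(f"broker {broker_id}: {log_dir} is offline ({error})")
```

The tool also prints a couple of status lines before the JSON, so pass only the JSON line to the function. If a directory on the out-of-sync broker shows up here, the broker log should have the matching storage error.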

On Thu, Nov 14, 2019 at 9:35 AM Koushik Chitta <kchi...@microsoft.com.invalid> wrote:

> The topic partition having the ISR issue might be on an offline directory.
> Look into the metric "offlineLogDirectoryCount" or use kafka-log-dirs.sh
> to understand the issue with that directory. In most cases, it would be
> a KafkaStorageException.
> The partition reassignment would also be stuck/waiting because of this,
> when the reassignment JSON contains an offline directory.
>
> -----Original Message-----
> From: M. Manna <manme...@gmail.com>
> Sent: Wednesday, November 13, 2019 5:23 AM
> To: Kafka Users <users@kafka.apache.org>
> Subject: Re: Partition Reassignment is getting stuck
>
> On Wed, 13 Nov 2019 at 13:10, Ashutosh singh <getas...@gmail.com> wrote:
>
> > Yeah, although it wouldn't have any impact, I will have to try this
> > tonight as it is peak business hours now.
> > Instead of deleting all data, I will try to delete the topic partitions
> > which are having issues and then restart the broker. I believe it
> > should catch up, but I will let you know.
>
> Since you're doing it OOB hours, it should be fine. The issue you're
> mentioning here is not uncommon, but such occurrences should be close to
> minuscule. As long as you have >=3 replicas you should be able to do this
> comfortably.
>
> Thanks,
>
> > On Wed, Nov 13, 2019 at 6:23 PM M. Manna <manme...@gmail.com> wrote:
> >
> > > On Wed, 13 Nov 2019 at 12:41, Ashutosh singh <getas...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > All of a sudden I see an under-replicated partition in our Kafka
> > > > cluster and it is not getting replicated. It seems to be getting
> > > > stuck somewhere. The in-sync replica is missing from only one of
> > > > the brokers, so there seems to be some issue with that broker; on
> > > > the other hand, many other topics on that node are working fine.
> > > > I have tried a rolling restart of all the nodes in the cluster, but
> > > > that didn't help.
> > > > I tried manual reassignment of that particular topic, but that is
> > > > getting stuck forever. So I had to kill the reassignment by
> > > > deleting the /admin/reassign_partitions node. I restarted ZooKeeper
> > > > so that the leader changes and then tried to reassign partitions,
> > > > but again it is getting stuck.
> > > >
> > > > I would really appreciate it if someone could help me understand
> > > > the issue.
> > >
> > > If all you have is 1 broker not in sync - can you please try to stop
> > > that broker, delete all the data files on that broker, and restart?
> > > It should catch up.
> > >
> > > > No of nodes : 8
> > > > Version : 2.1.1
> > > >
> > > > --
> > > > Thanks
> > > > Ashu
> >
> > --
> > Thanx & Regard
> > Ashutosh Singh