If only one broker isn't in sync, it can be caused by a dead replica fetcher
thread, in my experience. I fixed it by restarting the affected broker, but
this was on 0.11, so YMMV.
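
To check whether it really is a single broker that has dropped out of the ISR,
something like the sketch below can tally which replicas are missing. It
assumes you feed it the text printed by "kafka-topics.sh --describe", with
partition lines of the form "Topic: t  Partition: 0  Leader: 1  Replicas:
1,2,3  Isr: 1,2". The function name and the sample topic are made up for
illustration, and the line format may differ between Kafka versions.

```python
import re
from collections import Counter

# Assumed shape of a partition line in `kafka-topics.sh --describe` output:
#   Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+|none)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def out_of_sync_counts(describe_output):
    """Count, per broker id, the partitions where it is a replica but not in the ISR."""
    counts = Counter()
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # topic summary lines and blank lines are skipped
        replicas = {int(b) for b in match.group("replicas").split(",") if b}
        isr = {int(b) for b in match.group("isr").split(",") if b}
        for broker in replicas - isr:
            counts[broker] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample output, not taken from the cluster in this thread.
    sample = (
        "Topic:orders\tPartitionCount:2\tReplicationFactor:3\tConfigs:\n"
        "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,2\n"
        "\tTopic: orders\tPartition: 1\tLeader: 2\tReplicas: 2,3,4\tIsr: 2,4\n"
    )
    for broker, missing in out_of_sync_counts(sample).most_common():
        print(f"broker {broker} is out of sync for {missing} partition(s)")
```

If one broker id accounts for all of the missing replicas, that broker is the
one whose fetcher threads and logs are worth looking at first.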



On Thu, Nov 14, 2019 at 9:35 AM Koushik Chitta
<kchi...@microsoft.com.invalid> wrote:

> The topic partition having the ISR issue might be on an offline directory.
> Look into the metric "OfflineLogDirectoryCount" or use kafka-log-dirs.sh
> to understand the issue with that directory. In most cases, it would be a
> KafkaStorageException.
> The partition reassignment would also be stuck/waiting because of this
> when the reassignment JSON contains an offline directory.
>
>
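
A rough sketch of how the kafka-log-dirs.sh check above could be scripted. It
assumes the JSON document printed by "kafka-log-dirs.sh --describe" has the
shape {"brokers": [{"broker": ..., "logDirs": [{"logDir": ..., "error": ...}]}]},
with "error" set to null for a healthy directory. The function name and the
sample paths are hypothetical, and the tool also prints a couple of status
lines before the JSON, which you would need to strip first.

```python
import json


def offline_log_dirs(log_dirs_json):
    """Return (broker_id, log_dir, error) for every log directory reporting an error."""
    report = json.loads(log_dirs_json)
    offline = []
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            # Assumption: a healthy directory has "error": null.
            if log_dir.get("error"):
                offline.append((broker["broker"], log_dir["logDir"], log_dir["error"]))
    return offline


if __name__ == "__main__":
    # Hypothetical sample, not taken from the cluster in this thread.
    sample = json.dumps({
        "version": 1,
        "brokers": [
            {"broker": 1, "logDirs": [
                {"logDir": "/data/kafka-a", "error": None, "partitions": []},
            ]},
            {"broker": 2, "logDirs": [
                {"logDir": "/data/kafka-b",
                 "error": "org.apache.kafka.common.errors.KafkaStorageException",
                 "partitions": []},
            ]},
        ],
    })
    for broker_id, path, error in offline_log_dirs(sample):
        print(f"broker {broker_id}: {path} is offline ({error})")
```

Any directory this reports should not appear as a target in the reassignment
JSON until it is back online.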
> -----Original Message-----
> From: M. Manna <manme...@gmail.com>
> Sent: Wednesday, November 13, 2019 5:23 AM
> To: Kafka Users <users@kafka.apache.org>
> Subject: Re: Partition Reassignment is getting stuck
>
> On Wed, 13 Nov 2019 at 13:10, Ashutosh singh <getas...@gmail.com> wrote:
>
> > Yeah. Although it shouldn't have any impact, I will have to try this
> > tonight, as it is peak business hours now.
> > Instead of deleting all the data, I will try deleting only the topic
> > partitions that are having issues and then restart the broker. I believe
> > it should catch up, but I will let you know.
> >
>
> Since you're doing it outside business hours, it should be fine. The issue
> you're describing is not unheard of, but it should occur only rarely. As
> long as you have >=3 replicas you should be able to do this comfortably.
>
> Thanks,
>
> >
> >
> >
> > On Wed, Nov 13, 2019 at 6:23 PM M. Manna <manme...@gmail.com> wrote:
> >
> > > On Wed, 13 Nov 2019 at 12:41, Ashutosh singh <getas...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > > All of a sudden I see an under-replicated partition in our Kafka
> > > > > cluster, and it is not getting replicated. It seems to be getting
> > > > > stuck somewhere. The in-sync replica is missing from only one of the
> > > > > brokers, so it seems there is some issue with that broker; on the
> > > > > other hand, there are many other topics on that node and they are
> > > > > working fine. I have tried a rolling restart of all the nodes in the
> > > > > cluster, but that didn't help.
> > > > > I tried a manual reassignment of that particular topic, but that gets
> > > > > stuck forever, so I had to kill the reassignment by deleting the
> > > > > /admin/reassign_partitions node. I restarted ZooKeeper so that the
> > > > > leader would change and then tried to reassign the partitions, but
> > > > > again it gets stuck.
> > > >
> > > > > I would really appreciate it if someone could help me understand the
> > > > > issue.
> > > >
> > >
> > > If all you have is 1 broker not in sync - can you please try to stop
> > > that broker, delete all the data files on that broker, and restart?
> > > It should catch up.
> > >
> > >
> > > >
> > > > No of nodes : 8
> > > > Version : 2.1.1
> > > >
> > > > --
> > > > Thanks
> > > > Ashu
> > > >
> > >
> >
> >
> > --
> > Thanx & Regard
> > Ashutosh Singh
> > 08151945559
> >
>
