You can monitor the replica-related metrics listed below. Try tuning
"replica.lag.time.max.ms" and "replica.fetch.max.bytes" (a sketch of the
relevant server.properties entries follows), and look for broker log lines
starting with "Shrinking ISR for partition ...".
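
As an illustration only (these happen to be the stock 0.10.x defaults, not
tuned recommendations for your cluster), the relevant server.properties
entries look like:

    # server.properties (illustrative values only)
    # How long a follower may lag before it is removed from the ISR.
    replica.lag.time.max.ms=10000
    # Maximum bytes fetched per partition in each replica fetch request.
    replica.fetch.max.bytes=1048576
    # Replica fetcher threads per source broker; raise gradually if followers fall behind.
    num.replica.fetchers=1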

kafka.server:type=ReplicaManager,name=IsrShrinksPerSec
kafka.server:type=ReplicaManager,name=IsrExpandsPerSec
kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica
kafka.server:type=FetcherLagMetrics,name=ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+)
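
If it helps, a rough command-line sketch for watching these (assuming the
brokers were started with JMX enabled, e.g. JMX_PORT=9999, and ZooKeeper is
at zk-host:2181; adjust hosts and ports for your cluster):

    # List partitions that are currently under-replicated
    bin/kafka-topics.sh --zookeeper zk-host:2181 --describe --under-replicated-partitions

    # Sample the ISR shrink rate on one broker via the bundled JmxTool
    bin/kafka-run-class.sh kafka.tools.JmxTool \
      --jmx-url service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi \
      --object-name 'kafka.server:type=ReplicaManager,name=IsrShrinksPerSec'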

On Thu, Oct 25, 2018 at 7:18 PM Suman B N <sumannew...@gmail.com> wrote:

> Still looking for a response here. Please assist.
>
> On Sat, Oct 20, 2018 at 12:43 AM Suman B N <sumannew...@gmail.com> wrote:
>
> > The rate of ingestion is not 150-200 rps. It's 150k-200k rps.
> >
> > On Fri, Oct 19, 2018 at 11:12 PM Suman B N <sumannew...@gmail.com>
> wrote:
> >
> >> Team,
> >> We have been observing some partitions being under-replicated. The broker
> >> version is 0.10.2.1. The actions below were carried out, but in vain:
> >>
> >>    - Tried restarting nodes.
> >>    - Tried increasing replica fetcher threads. Please recommend the ideal
> >>    number of replica fetcher threads for a 20-node cluster with 150-200 rps
> >>    spread across 1000 topics and 3000 partitions.
> >>    - Tried increasing network threads. (I think this doesn't have any
> >>    effect, but I still wanted to try.) Please recommend the ideal number of
> >>    network threads for a 20-node cluster with 150-200 rps spread across
> >>    1000 topics and 3000 partitions.
> >>
> >> The logs look very clean, with no exceptions. I don't have much idea of how
> >> replica fetcher threads and their logs can be debugged, so I am asking for
> >> help here. Any help or leads would be appreciated.
> >>
> >> --
> >> *Suman*
> >> *OlaCabs*
> >>
> >
> >
> > --
> > *Suman*
> > *OlaCabs*
> >
>
>
> --
> *Suman*
> *OlaCabs*
>
