Hi Garvit,

Are the consumers Java? If so, can you take a thread dump of the affected consumer JVM every five seconds for 30 seconds total?

Thanks,
Steve

On Thu, Jun 27, 2019 at 12:00 AM Garvit Sharma <eng.gar...@gmail.com> wrote:

> I don't think that is the case. The lag is huge ~10^5 records.
>
> On Thu, Jun 27, 2019 at 9:13 AM Srinath C <srinat...@gmail.com> wrote:
>
>> Ok Garvit I still don't see the image but based on these inputs you
>> provided I'm thinking that the possible scenario could be that between
>> two polls from the consumer:
>> (a) the number of records added to the partitions already consumed in
>> previous poll is 500 or more (max.poll.records)
>> or
>> (b) the size of records in the partitions already consumed in previous
>> poll is around ~50Mb (fetch.max.bytes)
>>
>> Regards,
>> Srinath.
>>
>> On Thu, Jun 27, 2019 at 7:41 AM Garvit Sharma <eng.gar...@gmail.com>
>> wrote:
>>
>>> Hi Srinath,
>>>
>>> I have attached the image.
>>>
>>> The partitions belong to the same topic only. I have not explicitly set
>>> max.partition.fetch.bytes or fetch.max.bytes or max.poll.records so it
>>> should take the default values.
>>>
>>> Let me know.
>>>
>>> Thanks,
>>>
>>> On Thu, Jun 27, 2019 at 7:11 AM Srinath C <srinat...@gmail.com> wrote:
>>>
>>>> Hi Garvit,
>>>>
>>>> Am unable to see the image you attached for some reason and am not
>>>> able to see if the partitions are in the same topic or in different
>>>> topics.
>>>> Check if any of max.partition.fetch.bytes or fetch.max.bytes or
>>>> max.poll.records configured in your consumer is causing the behaviour.
>>>>
>>>> Regards,
>>>> Srinath
>>>>
>>>> On Thu, Jun 27, 2019 at 5:43 AM Peter Bukowinski <pmb...@gmail.com>
>>>> wrote:
>>>>
>>>>> Is there a correlation between the lagging partitions and the
>>>>> consumer assigned to them?
>>>>>
>>>>>> On Jun 26, 2019, at 4:25 PM, Garvit Sharma <eng.gar...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Can anyone please help me with this.
>>>>>>
>>>>>> On Wed, Jun 26, 2019 at 8:56 PM Garvit Sharma
>>>>>> <eng.gar...@gmail.com> wrote:
>>>>>>
>>>>>>> Hey Steve,
>>>>>>>
>>>>>>> I have checked, count of messages on all the partitions are same.
>>>>>>>
>>>>>>> I am still exploring an approach using which the root cause could
>>>>>>> be determined.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> On Wed, Jun 26, 2019 at 8:07 PM Garvit Sharma
>>>>>>> <eng.gar...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am not sure about that. Is there a way to analyse that ?
>>>>>>>>
>>>>>>>> On Wed, Jun 26, 2019 at 7:35 PM Steve Howard
>>>>>>>> <steve.how...@confluent.io> wrote:
>>>>>>>>
>>>>>>>>> Hi Garvit,
>>>>>>>>>
>>>>>>>>> Are the slow partitions "hot", i.e., receiving a lot more
>>>>>>>>> messages than others?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Steve
>>>>>>>>>
>>>>>>>>> On Wed, Jun 26, 2019, 9:56 AM Garvit Sharma
>>>>>>>>> <eng.gar...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Just to add more details, these consumers are processing the
>>>>>>>>>> Kafka events and writing to DB(fast write guaranteed).
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 26, 2019 at 7:23 PM Garvit Sharma
>>>>>>>>>> <eng.gar...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I can see huge consumer lag in a few partitions of Kafka
>>>>>>>>>>> topic. I need to know the root cause of this issue.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know, how to proceed.
>>>>>>>>>>>
>>>>>>>>>>> Below is sample consumer lag data :
>>>>>>>>>>>
>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
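
For reference, the request at the top of this thread (a thread dump every five seconds for 30 seconds) is normally satisfied from outside the process with the JDK tools `jstack <pid>` or `jcmd <pid> Thread.print`, run repeatedly. Where attaching those tools is not possible, a rough in-process equivalent might look like the sketch below. It is only an illustration and rests on several assumptions: the class name `ThreadDumpSampler` and the method `captureDumps` are hypothetical, the code has to run inside the affected consumer JVM (it only sees its own threads), and "30 seconds total" is read here as six dumps taken five seconds apart.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: samples the threads of the JVM it runs in.
public class ThreadDumpSampler {

    /** Captures `count` dumps of the current JVM, sleeping `intervalMillis` between them. */
    static List<String> captureDumps(int count, long intervalMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Request lock details only where this JVM supports them.
            ThreadInfo[] infos = threads.dumpAllThreads(
                    threads.isObjectMonitorUsageSupported(),
                    threads.isSynchronizerUsageSupported());
            StringBuilder dump = new StringBuilder();
            for (ThreadInfo info : infos) {
                dump.append('"').append(info.getThreadName()).append("\" ")
                    .append(info.getThreadState()).append('\n');
                // Print every frame; ThreadInfo.toString() would truncate the stack.
                for (StackTraceElement frame : info.getStackTrace()) {
                    dump.append("    at ").append(frame).append('\n');
                }
                dump.append('\n');
            }
            dumps.add(dump.toString());
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        return dumps;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: six samples, five seconds apart.
        List<String> dumps = captureDumps(6, 5_000L);
        for (int i = 0; i < dumps.size(); i++) {
            System.out.println("=== dump " + (i + 1) + " of " + dumps.size() + " ===");
            System.out.print(dumps.get(i));
        }
    }
}
```

Comparing consecutive dumps is the useful part: a consumer thread that shows the same stack in every sample (for example, blocked in a database write or waiting on a lock) is a likely candidate for the lagging partitions.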