Total shot in the dark, but could this be related? It talks about CPU, but
it could have an impact on memory as well:
http://kafka.apache.org/0102/documentation.html#upgrade_10_performance_impact
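
For reference, that section describes 0.10.0 brokers converting messages
to the older format for consumers on pre-0.10.0 clients, which rules out
zero-copy transfer. If that turns out to be what is happening here (an
assumption, since the client versions are not known yet), the workaround
the docs suggest is to pin the on-disk message format in server.properties
until the consumers are upgraded. A minimal sketch, not a tested config:

```properties
# Assumption: some consumers are still on 0.9.x clients.
# Keep the on-disk message format at the old version so the broker
# does not have to down-convert messages on every fetch.
log.message.format.version=0.9.0
```

Once all consumers are on 0.10.0 clients, the setting can be moved to
0.10.0 as the upgrade notes describe.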

Hope this helps.


On Sun, 9 Jul 2017 at 10:45 John Yost <hokiege...@gmail.com> wrote:

> Hey Ismael,
>
> Thanks a bunch for responding so quickly--really appreciate the follow-up!
> I will have to get those details tomorrow when I return to the office.
>
> Thanks again, will forward details ASAP tomorrow.
>
> --John
>
> On Sun, Jul 9, 2017 at 10:41 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi John,
> >
> > We would need more details to be able to help. What is the version of
> > your producers and consumers, is compression being used (and the
> > compression type if it is) and what is the broker/topic message format
> > version?
> >
> > Ismael
> >
> > On Sun, Jul 9, 2017 at 1:13 PM, John Yost <hokiege...@gmail.com> wrote:
> >
> > > Hey Everyone,
> > >
> > > When we originally upgraded from 0.9.0.1 to 0.10.0 with the exact same
> > > settings we immediately observed OOM errors. I upped the heap size
> > > from 6 GB to 10 GB and that solved the OOM issue. However, I am now
> > > seeing that the ISR count for all partitions goes from 3 to 1 after
> > > about an hour following broker start.
> > >
> > > Monitoring with jstat it appears that, after about an hour, the young
> > > generation partition stays at or near 100%, at which point the ISR
> > > count for each partition goes from 3 to 1 and remains there. There
> > > appears to be a correlation of high GC activity and replica fetch lag.
> > >
> > > I am thinking that GC pauses are the issue, which is a result of
> > > increasing the memory heap size. But, without increasing the memory
> > > heap size, we get OOM errors.
> > >
> > > Any ideas? There must be a setting somewhere that is causing the memory
> > > heap to fill up in 0.10.0 that did not affect 0.9.0.1.
> > >
> > > Thanks
> > >
> > > --John
> > >
> >
>
