Hey Ismael,

Thanks a bunch for responding so quickly--really appreciate the follow-up!
I will have to get those details tomorrow when I return to the office.

Thanks again, will forward details ASAP tomorrow.

--John

On Sun, Jul 9, 2017 at 10:41 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi John,
>
> We would need more details to be able to help. Which versions are your
> producers and consumers? Is compression being used and, if so, which
> compression type? And what is the broker/topic message format version?
>
> Ismael
>
> On Sun, Jul 9, 2017 at 1:13 PM, John Yost <hokiege...@gmail.com> wrote:
>
> > Hey Everyone,
> >
> > When we originally upgraded from 0.9.0.1 to 0.10.0 with the exact same
> > settings we immediately observed OOM errors. I upped the heap size from 6
> > GB to 10 GB and that solved the OOM issue. However, I am now seeing that
> > the ISR count for all partitions goes from 3 to 1 after about an hour
> > following broker start.
> >
> > Monitoring with jstat it appears that, after about an hour, the young
> > generation partition stays at or near 100%, at which point the ISR count
> > for each partition goes from 3 to 1 and remains there. There appears to be
> > a correlation of high GC activity and replica fetch lag.
> >
> > I am thinking that GC pauses are the issue, which is a result of increasing
> > the memory heap size. But, without increasing the memory heap size, we get
> > OOM errors.
> >
> > Any ideas? There must be a setting somewhere that is causing the memory
> > heap to fill up in 0.10.0 that did not affect 0.9.0.1.
> >
> > Thanks
> >
> > --John
> >
>
