BTW, we tried the following Confluent-recommended settings and one broker
crashed after 30 minutes with an out-of-memory error:

    -Xms6g -Xmx6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20
    -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
    -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80


On Sun, Jul 9, 2017 at 8:13 AM, John Yost <hokiege...@gmail.com> wrote:

> Hey Everyone,
>
> When we originally upgraded from 0.9.0.1 to 0.10.0 with the exact same
> settings, we immediately observed OOM errors. I upped the heap size from
> 6 GB to 10 GB and that solved the OOM issue. However, I am now seeing that
> the ISR count for all partitions drops from 3 to 1 about an hour after
> broker start.
>
> Monitoring with jstat, it appears that after about an hour the young
> generation stays at or near 100% utilization, at which point the ISR count
> for each partition goes from 3 to 1 and remains there. There appears to be
> a correlation between high GC activity and replica fetch lag.
>
> I am thinking that GC pauses are the issue, and that they are a result of
> increasing the heap size. But without increasing the heap size, we get
> OOM errors.
>
> Any ideas? There must be a setting somewhere that is causing the memory
> heap to fill up in 0.10.0 that did not affect 0.9.0.1.
>
> Thanks
>
> --John
>
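
In case it helps anyone reproduce the jstat observation quoted above, here
is a rough sketch of how captured `jstat -gcutil <pid> <interval>` output
could be checked for a young generation that stays pinned near 100%. The
function names, the 95% threshold, and the three-sample window are
assumptions chosen for illustration, and the sample text is made up (Java 8
column layout), not output from our brokers.

```python
def parse_gcutil(text):
    """Parse `jstat -gcutil` output into a list of {column: value} dicts."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    samples = []
    for row in rows:
        if row == header:  # jstat repeats the header when -h is used
            continue
        # jstat prints "-" for columns that do not apply; keep those as None.
        samples.append(
            {name: (None if value == "-" else float(value))
             for name, value in zip(header, row)}
        )
    return samples


def eden_saturated(samples, threshold=95.0, min_consecutive=3):
    """True if Eden ("E") is >= threshold for min_consecutive samples in a row."""
    run = 0
    for sample in samples:
        run = run + 1 if sample["E"] >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Made-up sample for illustration only; not data from the brokers.
SAMPLE = """\
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  42.10  55.30  97.20  94.10    120    3.210     0    0.000    3.210
  0.00 100.00  98.70  61.00  97.20  94.10    121    3.260     0    0.000    3.260
  0.00 100.00  99.40  61.00  97.20  94.10    121    3.260     0    0.000    3.260
  0.00 100.00 100.00  61.00  97.20  94.10    121    3.260     0    0.000    3.260
"""

if __name__ == "__main__":
    print(eden_saturated(parse_gcutil(SAMPLE)))
```

Matching columns by the header row rather than by position keeps the sketch
usable across JVM versions whose `-gcutil` layouts differ. Lining up the
timestamps of saturated windows against ISR shrink events in the broker
logs would be one way to test the suspected correlation.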
