We've been seeing quite a few symptoms that look like long GC stalls
(nonsensical ZK session timeouts) with the following config:

-Xmx16g
-Xms16g
-server
-XX:+CMSClassUnloadingEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+UseG1GC
-XX:+DisableExplicitGC
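
For context: a pause only has to outlast the broker's ZooKeeper session
timeout to show up as a dropped session. A minimal sketch of the knob I
assume is relevant here (the value shown is the 0.9-era default, worth
double-checking for your version):

# server.properties: broker-side ZK session timeout; a single
# stop-the-world pause longer than this will expire the session
zookeeper.session.timeout.ms=6000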

Next steps are to turn on GC logging (flag sketch below) and confirm that the
ZK session timeouts really are GC pauses (they look like major collections).
In the meantime, does anyone have experience with whether these options
(taken from https://kafka.apache.org/081/ops.html) actually help? I'd prefer
not to turn options on blindly.

-XX:PermSize=48m
-XX:MaxPermSize=48m
-XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35
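
For the GC-logging step, a minimal flag set (JDK 8 style; essentially the
logging subset of the config Stephen shares further down, so treat it as a
sketch rather than a recommendation):

-verbose:gc
-Xloggc:/var/log/kafka/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M

The PrintGCApplicationStoppedTime lines are the interesting ones for
correlating pauses with the ZK session timeouts.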

Thanks!
Ben Osheroff
Zendesk.com

On Thu, Jun 09, 2016 at 03:52:41PM -0400, Stephen Powis wrote:
> NOTE -- GC tuning is by no means within my realm of expertise, so I'm
> not sure I'd use our info as any kind of benchmark.
>
> But in the interest of sharing, we use the following options:
>
> > export KAFKA_HEAP_OPTS="-Xmx12G -Xms12G"
> >
> > export KAFKA_JVM_PERFORMANCE_OPTS="-server -Djava.awt.headless=true
> > -XX:MaxPermSize=48M -verbose:gc -Xloggc:/var/log/kafka/gc.log
> > -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution
> > -XX:+PrintGCApplicationStoppedTime -XX:+PrintTLAB -XX:+DisableExplicitGC
> > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M
> > -XX:+UseCompressedOops -XX:+AlwaysPreTouch -XX:+UseG1GC
> > -XX:MaxGCPauseMillis=20 -XX:+HeapDumpOnOutOfMemoryError
> > -XX:HeapDumpPath=/var/log/kafka/heapDump.log"
> >
>
> You can then take your gc.log files and run them through an analyzer tool... I've
> attached a link to one of our brokers' GC logs run through gceasy.io.
>
> https://protect-us.mimecast.com/s/wXqqBJuqdZb1Tn
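
(A quick way to eyeball a log like this for long stop-the-world pauses,
assuming -XX:+PrintGCApplicationStoppedTime is enabled as in the config
above, is something like

grep "Total time for which application threads were stopped" /var/log/kafka/gc.log

Anything in those lines approaching the ZK session timeout would be the
smoking gun.)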
>
> On Thu, Jun 9, 2016 at 3:39 PM, Lawrence Weikum <lwei...@pandora.com> wrote:
>
> > Hi Tom,
> >
> > Currently we’re using the default settings – no special tuning
> > whatsoever.  I think the kafka-run-class.sh has this:
> >
> >
> > # Memory options
> > if [ -z "$KAFKA_HEAP_OPTS" ]; then
> >   KAFKA_HEAP_OPTS="-Xmx256M"
> > fi
> >
> > # JVM performance options
> > if [ -z "$KAFKA_JVM_PERFORMANCE_OPTS" ]; then
> >   KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20
> > -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC
> > -Djava.awt.headless=true"
> > fi
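
For what it's worth, those defaults can be overridden just by exporting the
variables before starting the broker; a rough sketch, assuming the stock
start script (the 4G heap is only borrowed from Tom's note below, not a
recommendation):

export KAFKA_HEAP_OPTS="-Xmx4G -Xms4G"
export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 \
  -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true"
bin/kafka-server-start.sh config/server.properties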
> >
> >
> > Is this the confluent doc you were referring to?
> > https://protect-us.mimecast.com/s/arXXBOspkvORCD
> >
> > Thanks!
> >
> > Lawrence Weikum
> >
> >
> > On 6/9/16, 1:32 PM, "Tom Crayford" <tcrayf...@heroku.com> wrote:
> >
> > >Hi Lawrence,
> > >
> > >What JVM options were you using? There are a few pages in the Confluent docs
> > >on JVM tuning, IIRC. We simply use G1 and a 4GB max heap, and things work
> > >well (running many thousands of clusters).
> > >
> > >Thanks
> > >Tom Crayford
> > >Heroku Kafka
> > >
> > >On Thursday, 9 June 2016, Lawrence Weikum <lwei...@pandora.com> wrote:
> > >
> > >> Hello all,
> > >>
> > >> We’ve been running a benchmark test on a Kafka cluster of ours running
> > >> 0.9.0.1 – slamming it with messages to see when/if things might break.
> > >> During our test, we caused two brokers to throw OutOfMemory errors
> > >> (looks
> > >> like from the Heap) even though each machine still has 43% of the total
> > >> memory unused.
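
(One quick sanity check here, sketched with a placeholder pid: with the
kafka-run-class.sh defaults shown further up, the heap is capped at 256M no
matter how much system memory is free, and the effective cap on a running
broker can be confirmed with

jinfo -flag MaxHeapSize <broker-pid>

which prints the limit in bytes.)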
> > >>
> > >> I’m curious what JVM optimizations are recommended for Kafka brokers?
> > >> Or
> > >> if there aren’t any that are recommended, what are some optimizations
> > >> others are using to keep the brokers running smoothly?
> > >>
> > >> Best,
> > >>
> > >> Lawrence Weikum
> > >>
> > >>
> >
> >
