Hey Ben,

Using G1 with those settings appears to be working well for us. Young-gen (minor) GCs are infrequent and average about 12ms each, and there were no full GCs in the 24 hours of logs I uploaded. I'd say enable the GC log flags and let it run for a bit, then change a setting or two and compare.
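For the "compare" step, here is a rough sketch of one way to summarise stop-the-world pauses from a gc.log. It assumes a JDK 7/8 HotSpot log written with -XX:+PrintGCApplicationStoppedTime (one of the flags in the options quoted below), which emits lines containing "Total time for which application threads were stopped: N seconds". The function name summarize_gc_pauses and the 1-second default threshold are made up for illustration and are not part of Kafka or the JDK.

```shell
#!/bin/sh
# Sketch only: summarise stop-the-world pauses from a HotSpot (JDK 7/8) GC log
# produced with -XX:+PrintGCApplicationStoppedTime.
# summarize_gc_pauses is a hypothetical helper name, not Kafka or JDK tooling.
summarize_gc_pauses() {
    log_file="$1"
    threshold="${2:-1.0}"   # seconds; assumed default, set it near your ZK session timeout
    awk -v threshold="$threshold" '
        /Total time for which application threads were stopped:/ {
            # The pause length is the field right after "stopped:".
            for (i = 1; i < NF; i++) {
                if ($i == "stopped:") {
                    pause = $(i + 1) + 0
                    count++
                    total += pause
                    if (pause > max) max = pause
                    if (pause >= threshold) long_pauses++
                }
            }
        }
        END {
            # Unset counters print as 0 when the log has no matching lines.
            printf "pauses=%d total=%.3fs max=%.3fs over_threshold=%d\n", count, total, max, long_pauses
        }
    ' "$log_file"
}
```

Something like `summarize_gc_pauses /var/log/kafka/gc.log 6` run before and after a settings change gives two one-line summaries to compare. If over_threshold stays at 0 while the ZK session timeouts continue, the pauses are probably not the cause.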

On Thu, Jun 9, 2016 at 3:59 PM, Ben Osheroff <b...@zendesk.com.invalid> wrote:

> We've been having quite a few symptoms that appear to be big GC stalls
> (nonsensical ZK session timeouts) with the following config:
>
> -Xmx16g
> -Xms16g
> -server
> -XX:+CMSClassUnloadingEnabled
> -XX:+CMSScavengeBeforeRemark
> -XX:+UseG1GC
> -XX:+DisableExplicitGC
>
> Next steps will be to turn on gc logging and try to confirm that the ZK
> session timeouts are indeed GC pauses (they look like major
> collections), but meanwhile, does anyone have experience around whether
> these options (taken from https://kafka.apache.org/081/ops.html) helped?
> Would prefer to not just blindly turn on options if possible.
>
> -XX:PermSize=48m
> -XX:MaxPermSize=48m
> -XX:MaxGCPauseMillis=20
> -XX:InitiatingHeapOccupancyPercent=35
>
> Thanks!
> Ben Osheroff
> Zendesk.com
>
> On Thu, Jun 09, 2016 at 03:52:41PM -0400, Stephen Powis wrote:
> > NOTE -- GC tuning is outside the realm of my expertise by all means, so
> > I'm not sure I'd use our info as any kind of benchmark.
> >
> > But in the interest of sharing, we use the following options
> >
> > export KAFKA_HEAP_OPTS="-Xmx12G -Xms12G"
> >
> > export KAFKA_JVM_PERFORMANCE_OPTS="-server -Djava.awt.headless=true
> > -XX:MaxPermSize=48M -verbose:gc -Xloggc:/var/log/kafka/gc.log
> > -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution
> > -XX:+PrintGCApplicationStoppedTime -XX:+PrintTLAB -XX:+DisableExplicitGC
> > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M
> > -XX:+UseCompressedOops -XX:+AlwaysPreTouch -XX:+UseG1GC
> > -XX:MaxGCPauseMillis=20 -XX:+HeapDumpOnOutOfMemoryError
> > -XX:HeapDumpPath=/var/log/kafka/heapDump.log"
> >
> > You can then take your gc.log files and use an analyzer tool...I've
> > attached a link to one of our brokers gclog run thru gceasy.io.
> >
> > https://protect-us.mimecast.com/s/wXqqBJuqdZb1Tn
> >
> > On Thu, Jun 9, 2016 at 3:39 PM, Lawrence Weikum <lwei...@pandora.com> wrote:
> >
> > > Hi Tom,
> > >
> > > Currently we’re using the default settings – no special tuning
> > > whatsoever. I think the kafka-run-class.sh has this:
> > >
> > > # Memory options
> > > if [ -z "$KAFKA_HEAP_OPTS" ]; then
> > >   KAFKA_HEAP_OPTS="-Xmx256M"
> > > fi
> > >
> > > # JVM performance options
> > > if [ -z "$KAFKA_JVM_PERFORMANCE_OPTS" ]; then
> > >   KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20
> > > -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC
> > > -Djava.awt.headless=true"
> > > fi
> > >
> > > Is this the confluent doc you were referring to?
> > > https://protect-us.mimecast.com/s/arXXBOspkvORCD
> > >
> > > Thanks!
> > >
> > > Lawrence Weikum
> > >
> > > On 6/9/16, 1:32 PM, "Tom Crayford" <tcrayf...@heroku.com> wrote:
> > >
> > > >Hi Lawrence,
> > > >
> > > >What JVM options were you using? There's a few pages in the confluent docs
> > > >on JVM tuning iirc. We simply use the G1 and a 4GB Max heap and things work
> > > >well (running many thousands of clusters).
> > > >
> > > >Thanks
> > > >Tom Crayford
> > > >Heroku Kafka
> > > >
> > > >On Thursday, 9 June 2016, Lawrence Weikum <lwei...@pandora.com> wrote:
> > > >
> > > >> Hello all,
> > > >>
> > > >> We’ve been running a benchmark test on a Kafka cluster of ours running
> > > >> 0.9.0.1 – slamming it with messages to see when/if things might break.
> > > >> During our test, we caused two brokers to throw OutOfMemory errors (looks
> > > >> like from the Heap) even though each machine still has 43% of the total
> > > >> memory unused.
> > > >>
> > > >> I’m curious what JVM optimizations are recommended for Kafka brokers? Or
> > > >> if there aren’t any that are recommended, what are some optimizations
> > > >> others are using to keep the brokers running smoothly?
> > > >>
> > > >> Best,
> > > >>
> > > >> Lawrence Weikum