Hi Ted,

I've also posted this question on the kafka-clients Google group. Here is the link: <https://groups.google.com/forum/#!topic/kafka-clients/AeglVfsRCak>. The attachments are available there as well.
Thanks,
Avinash

On Tue, 23 Jan 2018 at 17:23 Ted Yu <yuzhih...@gmail.com> wrote:

> Did you attach two .png files ?
>
> Please use third party site since the attachment didn't come thru.
>
> On Tue, Jan 23, 2018 at 5:20 PM, Avinash Herle <avinash.herl...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I'm using Kafka as a messaging system in my data pipeline. I have a couple
> > of producer processes in my pipeline, with Spark Streaming
> > <https://spark.apache.org/docs/2.2.1/streaming-kafka-0-10-integration.html>
> > and Druid's Kafka indexing service
> > <http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html>
> > as consumers of Kafka. The indexing service spawns 40 new indexing tasks
> > (Kafka consumers) every 15 mins.
> >
> > The heap memory used on Kafka seems fairly constant for an hour, after
> > which it shoots up to the max allocated space. The garbage collection
> > logs of Kafka seem to indicate a memory leak in Kafka. Find attached the
> > plots generated from the GC logs.
> >
> > *Kafka Deployment:*
> > 3 nodes, with 3 topics and 64 partitions per topic
> >
> > *Kafka Runtime JVM parameters:*
> > 8GB heap memory
> > 1GB swap memory
> > Using G1GC
> > MaxGCPauseMillis=20
> > InitiatingHeapOccupancyPercent=35
> >
> > *Kafka Versions Used:*
> > I've used Kafka versions 0.10.0, 0.11.0.2 and 1.0.0 and find similar
> > behavior.
> >
> > *Questions:*
> > 1) Is this a memory leak on the Kafka side or a misconfiguration of my
> > Kafka cluster?
> > 2) Druid creates new indexing tasks periodically. Does Kafka stably
> > handle a large number of consumers being added periodically?
> > 3) As a knock-on effect, we also notice Kafka partitions going offline
> > periodically after some time with the following error:
> > ERROR [ReplicaFetcherThread-18-2], Error for partition [topic1,2] to
> > broker 2:*org.apache.kafka.common.errors.UnknownTopicOrPartitionException*:
> > This server does not host this topic-partition.
> > (kafka.server.ReplicaFetcherThread)
> >
> > Can someone shed some light on the behavior being seen in my cluster?
> >
> > Please let me know if more details are needed to root cause the behavior
> > being seen.
> >
> > Thanks in advance.
> >
> > Avinash
> > [image: Screen Shot 2018-01-23 at 2.29.04 PM.png][image: Screen Shot
> > 2018-01-23 at 2.29.21 PM.png]

--
Excuse brevity and typos. Sent from mobile device.
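
Since the plots did not come through, here is a rough way to check the GC logs directly for the pattern described above (heap flat for about an hour, then climbing to the maximum). This is only a sketch. It assumes the brokers run JDK 8 with -XX:+PrintGCDetails, so that each G1 collection logs a summary such as "Heap: 5120.0M(8192.0M)->3072.0M(8192.0M)". The function names and the window size are made up for illustration and are not part of Kafka or the JVM. A rising post-GC floor is a hint, not proof of a leak, because G1 young collections leave the old generation untouched until mixed collections run.

```python
import re

# Assumed log format (JDK 8, G1, -XX:+PrintGCDetails), for example:
#   [Eden: ... Survivors: ... Heap: 5120.0M(8192.0M)->3072.0M(8192.0M)]
# Captures the occupancy after the collection and its unit.
HEAP_RE = re.compile(r"Heap: [\d.]+[BKMG]\([\d.]+[BKMG]\)->([\d.]+)([BKMG])\(")

UNIT_TO_MB = {"B": 1 / (1024 * 1024), "K": 1 / 1024, "M": 1.0, "G": 1024.0}


def heap_after_gc_mb(lines):
    """Return the heap occupancy in MB after each GC event found in the lines."""
    values = []
    for line in lines:
        match = HEAP_RE.search(line)
        if match:
            values.append(float(match.group(1)) * UNIT_TO_MB[match.group(2)])
    return values


def floor_is_rising(values, window=3):
    """Heuristic: is the lowest post-GC occupancy higher at the end than at the start?

    Compares the minimum of the first `window` samples with the minimum of the
    last `window` samples. Returns False when there are too few samples.
    """
    if len(values) < 2 * window:
        return False
    return min(values[-window:]) > min(values[:window])
```

To use it, read the broker's GC log file line by line, pass the lines to heap_after_gc_mb, and pass the result to floor_is_rising. If the floor stays flat while the peaks grow, the cause is more likely allocation pressure, for example from the 40 new consumers every 15 minutes, than a leak.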