[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583570#comment-16583570 ]
Ted Yu commented on KAFKA-7304:
-------------------------------
Looking at the close() method, I don't see where the channels in
closingChannels are closed.
{code}
diff --git a/clients/src/main/java/org/apache/kafka/common/network/Selector.java b/clients/src/main/java/org/apache/kafka/common/network/Selector.java
index 7e32509..2164a40 100644
--- a/clients/src/main/java/org/apache/kafka/common/network/Selector.java
+++ b/clients/src/main/java/org/apache/kafka/common/network/Selector.java
@@ -320,6 +320,10 @@ public class Selector implements Selectable, AutoCloseable {
         }
         sensors.close();
         channelBuilder.close();
+        for (Map.Entry<String, KafkaChannel> entry : this.closingChannels.entrySet()) {
+            doClose(entry.getValue(), false);
+        }
+        this.closingChannels.clear();
     }
 
     /**
{code}
I wonder if the above change would fix the leakage.
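To make the idea easier to reason about outside the Kafka code base, here is a minimal sketch of the pattern the patch is aiming for. SketchChannel and SketchSelector are hypothetical stand-ins that I made up for illustration; they are not Kafka classes, and beginGracefulClose() only approximates how a channel might end up parked in closingChannels. The sketch only covers the close() path, so it says nothing about entries that pile up while the selector is still running.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for KafkaChannel; it only records whether it was closed. */
class SketchChannel {
    private final String id;
    private boolean closed = false;

    SketchChannel(String id) { this.id = id; }

    String id() { return id; }
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

/** Hypothetical, heavily simplified selector holding the two maps from the report. */
class SketchSelector implements AutoCloseable {
    private final Map<String, SketchChannel> channels = new HashMap<>();
    private final Map<String, SketchChannel> closingChannels = new HashMap<>();

    void register(SketchChannel channel) {
        channels.put(channel.id(), channel);
    }

    /** Assumed behaviour: a channel being closed gracefully is parked in closingChannels. */
    void beginGracefulClose(String id) {
        SketchChannel channel = channels.remove(id);
        if (channel != null) {
            closingChannels.put(id, channel);
        }
    }

    int openCount() { return channels.size(); }
    int closingCount() { return closingChannels.size(); }

    @Override
    public void close() {
        // Close the channels in both maps, then drop the references so
        // nothing stays reachable through the selector afterwards.
        for (SketchChannel channel : channels.values()) {
            channel.close();
        }
        channels.clear();
        for (SketchChannel channel : closingChannels.values()) {
            channel.close();
        }
        closingChannels.clear();
    }
}
```

With this shape, registering two channels, moving one of them into closingChannels, and then calling close() should leave both channels closed and both maps empty.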
> memory leakage in org.apache.kafka.common.network.Selector
> ----------------------------------------------------------
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 1.1.0, 1.1.1
> Reporter: Yu Yang
> Priority: Major
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png,
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL
> writes to Kafka worked fine. However, when we enabled SSL writes at a
> larger scale (>40k clients writing concurrently), the Kafka brokers soon hit
> OutOfMemory errors with a 4 GB heap setting. We tried increasing the
> heap size to 10 GB, but encountered the same issue.
> We took a few heap dumps and found that most of the heap memory is
> referenced through the org.apache.kafka.common.network.Selector object. There
> are two channel map fields in Selector. It seems that objects are
> not removed from these maps in a timely manner.
> One observation is that the memory leak seems related to Kafka partition
> leader changes. If a broker restart or similar event in the cluster causes
> partition leadership changes, the brokers may hit the OOM issue faster.
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC
> analysis.
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka:
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M
> -Djava.awt.headless=true
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.port=9999
> -Dcom.sun.management.jmxremote.rmi.port=9999 -cp /usr/local/libs/*
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the
> X509Factory.certCache map size from 750 to 20.
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}