[ https://issues.apache.org/jira/browse/KAFKA-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391839#comment-16391839 ]
Maciej Bryński commented on KAFKA-6626:
---------------------------------------

I think even the standard Java HashMap can be used. The only thing I wonder is whether we need to use IdentityHashMap (comparing records by instance, not by equals), [~ewencp]?

> Performance bottleneck in Kafka Connect sendRecords
> ---------------------------------------------------
>
>                 Key: KAFKA-6626
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6626
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Maciej Bryński
>            Priority: Major
>         Attachments: MapPerf.java, image-2018-03-08-08-35-19-247.png
>
>
> Kafka Connect is using IdentityHashMap for storing records.
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L239]
> Unfortunately this solution is very slow (2-4 times slower than a normal HashMap / HashSet).
> Benchmark result (code in attachment).
> {code:java}
> Identity 4220
> Set 2115
> Map 1941
> Fast Set 2121
> {code}
> Things are even worse when using the default GC configuration
> (-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -Djava.awt.headless=true)
> {code:java}
> Identity 7885
> Set 2364
> Map 1548
> Fast Set 1520
> {code}
> Java version
> {code:java}
> java version "1.8.0_152"
> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
> {code}
> This problem is greatly slowing Kafka Connect.
> !image-2018-03-08-08-35-19-247.png!
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
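
To illustrate the question raised in the comment (instance comparison versus equals), here is a minimal sketch of how the two map types can diverge. The `Rec` class and the `countDistinct` helper are hypothetical stand-ins made up for this example, not Kafka Connect code; the sketch assumes a record type that defines value-based equals/hashCode.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityVsEquals {
    // Hypothetical record type with value-based equality (an assumption for this sketch).
    static final class Rec {
        final String value;

        Rec(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Rec && ((Rec) o).value.equals(value);
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    // Inserts two distinct instances that are equal by value and reports
    // how many entries the given map ends up holding.
    static int countDistinct(Map<Rec, Rec> map) {
        Rec a = new Rec("payload");
        Rec b = new Rec("payload");
        map.put(a, a);
        map.put(b, b);
        return map.size();
    }

    public static void main(String[] args) {
        // IdentityHashMap compares keys with ==, so both instances are kept.
        System.out.println("IdentityHashMap size: " + countDistinct(new IdentityHashMap<>()));
        // HashMap compares keys with equals/hashCode, so the second put replaces the first.
        System.out.println("HashMap size: " + countDistinct(new HashMap<>()));
    }
}
```

If two in-flight records can be equal by value while being separate instances, switching to HashMap would collapse them into one entry, which is the behavior difference the comment asks about.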