[ https://issues.apache.org/jira/browse/KAFKA-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013400#comment-17013400 ]
Lucas Bradstreet edited comment on KAFKA-9339 at 1/11/20 8:03 AM:
------------------------------------------------------------------

Hi [~jbrownEP],

Could you please try running async-profiler ([https://github.com/jvm-profiling-tools/async-profiler])?

Running both of these would likely be enough, and attach the files to this ticket.
{noformat}
./profiler.sh -d 60 -f allocs.txt -e alloc PID
./profiler.sh -d 60 -f profile.svg PID
# if you hit a security issue or end up with an empty svg file, try the itimer fallback
./profiler.sh -d 60 -e itimer -f profile.svg PID
{noformat}

Thanks!


was (Author: lucasbradstreet):
Hi [~jbrownEP],

Could you please try running async-profiler ([https://github.com/jvm-profiling-tools/async-profiler])?

Running both of these would likely be enough, and attach the files to this ticket.

./profiler.sh -d 60 -f allocs.txt -e alloc PID
./profiler.sh -d 60 -f profile.svg PID
# if you hit a security issue or end up with an empty svg file, try the itimer fallback
./profiler.sh -d 60 -e itimer -f profile.svg PID

Thanks!

> Increased CPU utilization in brokers in 2.4.0
> ---------------------------------------------
>
>                 Key: KAFKA-9339
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9339
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>         Environment: CentOS 6; Java 1.8.0_232 (OpenJDK)
>            Reporter: James Brown
>            Priority: Minor
>
> I upgraded one of my company's test clusters from 2.3.1 to 2.4.0 and have
> noticed a significant (40%) increase in the CPU time consumed. This is a
> small cluster of three nodes (running on t2.large EC2 instances all in the
> same AZ) pushing about 150 message/s in aggregate spread across 208 topics (a
> total of 266 partitions; most topics only have one partition). Leadership is
> reasonably well-distributed and each node has between 83 and 94 partitions
> which it leads. This CPU time increase is visible symmetrically on all three
> nodes in the cluster (e.g., the controller isn't using more CPU than the
> other nodes).
>
> The CPU consumption did not return to normal after I did the second restart
> to bump the log and inter-broker protocol versions to 2.4, so I don't think
> it has anything to do with down-converting to the 2.3 protocols.
>
> No settings were changed, nor was anything about the JVM changed. There is
> nothing interesting being written to the logs. There's no sign of any
> instability (partitions aren't being reassigned, etc).
>
> The best guess I have for the increased CPU usage is that the number of
> garbage collections increased by approximately 30%, suggesting that something
> is churning a lot more garbage inside Kafka. This is a small cluster, so it's
> only got a 3GB heap allocated to Kafka on each node; we're using G1GC with
> some light tuning and are on Java 8 if that helps.
>
> We are only using OpenJDK, so I don't think I can produce a Flight Recorder
> profile.
>
> The kafka-users mailing list suggested this was worth filing a Jira issue
> about.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)