[ https://issues.apache.org/jira/browse/KAFKA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509927#comment-16509927 ]
rajadayalan perumalsamy edited comment on KAFKA-7012 at 6/12/18 6:07 PM:
-------------------------------------------------------------------------

Tried out in a different cluster of 20 nodes (million msgs/sec): upgraded 5 nodes to 1.0.1 with the patched jar. Can confirm that it is not affecting function or causing missing msgs, and latency also looks fine.

was (Author: rajadayalanvdms):
Tried out the patched jar in a different cluster 20 nodes(million msgs/sec), upgraded 5 nodes to 1.0.1 with the patched jar. can confirm that it is not affecting function/Missing msgs and latency also looks fine.

> Performance issue upgrading to kafka 1.0.1 or 1.1
> -------------------------------------------------
>
>                 Key: KAFKA-7012
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7012
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.0, 1.0.1
>            Reporter: rajadayalan perumalsamy
>            Assignee: praveen
>            Priority: Major
>         Attachments: Commit-47ee8e954-0607-bufferkeys-nopoll-profile.png,
> Commit-47ee8e954-0607-memory.png, Commit-47ee8e954-0607-profile.png,
> Commit-47ee8e954-profile.png, Commit-47ee8e954-profile2.png,
> Commit-f15cdbc91b-profile.png, Commit-f15cdbc91b-profile2.png
>
>
> We are trying to upgrade a Kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1.
> After upgrading 1 node of the cluster, we noticed that the network threads use
> most of the CPU. It is a 3 node cluster with 15k messages/sec on each node.
> With Kafka 0.11.0.1, typical usage of the servers is around 50 to 60%
> vCPU (using less than 1 vCPU). After the upgrade, CPU usage is high and
> depends on the number of network threads used. If network threads is
> set to 8, then the CPU usage is around 850% (9 vCPUs), and if it is set to 4,
> then the CPU usage is around 450% (5 vCPUs). Using the same Kafka
> server.properties for both.
> Did further analysis with git bisect and a couple of builds and deploys, and
> traced the issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> CPU usage is fine for commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d, but
> with commit 47ee8e954df62b9a79099e944ec4be29afe046f6 CPU usage has increased.
> Have attached screenshots of profiling done with both commits. Screenshot
> Commit-f15cdbc91b-profile shows less CPU usage by network threads, and
> screenshots Commit-47ee8e954-profile and Commit-47ee8e954-profile2 show
> higher CPU usage (almost the entire CPU usage) by network threads. Also
> noticed that the kafka.network.Processor.poll() method is invoked 10 times
> more often with commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
> We need the issue to be resolved to upgrade the cluster. Please let me know
> if you need any additional information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)