Hi,

Thanks for the response. We were previously on version 0.11, and all our producers/consumers have since been upgraded to either 1.0 or the latest 2.3.
Is it normal for the network threads to consume this much CPU? If you look at the top threads, the network threads consume 50% of the overall CPU.

Regards

On Mon, Jan 6, 2020 at 7:04 PM Thunder Stumpges <thunder.stump...@gmail.com> wrote:

> Not sure what version your producers/consumers are, or if you upgraded from
> a previous version that used to work, or what, but maybe you're hitting
> this?
>
> https://kafka.apache.org/23/documentation.html#upgrade_10_performance_impact
>
> On Mon, Jan 6, 2020 at 12:48 PM Navneeth Krishnan <reachnavnee...@gmail.com> wrote:
>
> > Hi All,
> >
> > Any idea on what can be done? Not sure if we are running into this below
> > bug.
> >
> > https://issues.apache.org/jira/browse/KAFKA-7925
> >
> > Thanks
> >
> > On Thu, Jan 2, 2020 at 4:18 PM Navneeth Krishnan <reachnavnee...@gmail.com> wrote:
> >
> >> Hi All,
> >>
> >> We have a kafka cluster with 12 nodes and we are pretty much seeing 90%
> >> cpu usage on all the nodes. Here is all the information. Need some help on
> >> figuring out what the problem is and how to overcome this issue.
> >>
> >> *Cluster:*
> >> Kafka version: 2.3.0
> >> Number of brokers in cluster: 12
> >> Node type: 4 vCores 32GB mem
> >> Network In: 10Mbps per broker
> >> Network Out: 16Mbps per broker
> >> Topics: 10 (approximately)
> >> Partitions: 20 (Max), some has only partitions
> >> Replication Factor: 3
> >>
> >> *CPU Usage:*
> >> [image: image.png]
> >>
> >> *VMStat*
> >>
> >> [root]# vmstat 1 10
> >> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> >>  r  b   swpd   free   buff    cache   si   so    bi    bo    in    cs us sy id wa st
> >>  8  0      0 234444  19064 24046980    0    0    17  2026     1     3 38 33 28  0  1
> >>  7  0      0 256444  19036 24023880    0    0   768     0 64027 22708 44 40 16  0  1
> >>  7  0      0 245356  19052 24034560    0    0   256   472 63509 23276 44 39 17  0  1
> >>  7  0      0 235096  19052 24046616    0    0     0     0 62277 22516 46 38 15  0  1
> >>  8  0      0 260548  19036 24020084    0    0   516 49888 62364 22894 43 38 18  0  1
> >>  5  0      0 249232  19036 24030924    0    0   512     0 61022 24589 41 39 20  0  1
> >>  6  0      0 238072  19036 24042512    0    0  1024     0 63358 23063 44 38 17  0  0
> >>  5  0      0 262904  19052 24017972    0    0     0   440 63078 23499 46 37 17  0  1
> >>  7  0      0 250324  19052 24030008    0    0     0     0 64615 22617 48 38 14  0  1
> >>  6  0      0 237920  19052 24042372    0    0  1024 48900 63223 23029 42 40 18  0  1
> >>
> >> *IO Stat:*
> >>
> >> [root]# iostat -m
> >> Linux 4.14.72-73.55.amzn2.x86_64 (loc-kafka11.internal.dnaspaces.io)  01/02/2020  _x86_64_  (4 CPU)
> >>
> >> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >>           38.11    0.00   33.09    0.11    0.61   28.08
> >>
> >> Device:     tps  MB_read/s  MB_wrtn/s  MB_read   MB_wrtn
> >> xvda       2.36       0.01       0.01    26760     43360
> >> nvme0n1    0.00       0.00       0.00        2         0
> >> xvdf      70.95       0.06       7.67   185908  25205338
> >>
> >> *Top Kafka broker threads:*
> >> [image: image.png]
> >>
> >> *Top 3:*
> >>
> >> "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-0" #60 prio=5 os_prio=0 tid=0x00007f8b1ab56000 nid=0x581f runnable [0x00007f8a886ce000]
> >>
> >> "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-2" #62 prio=5 os_prio=0 tid=0x00007f8b1ab59000 nid=0x5821 runnable [0x00007f8a6aefd000]
> >>
> >> "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-1" #61 prio=5 os_prio=0 tid=0x00007f8b1ab57800 nid=0x5820 runnable [0x00007f8a885cd000]
> >>
> >> It doesn't look like GC or IO is the problem.
> >>
> >> Thanks
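
For anyone trying to line up the thread dump entries above with per-thread CPU numbers, here is a minimal sketch. It assumes a HotSpot JVM on Linux, where the `nid` field in a thread dump is the native thread id in hex and tools such as `top -H` list the same id in decimal. The helper name `nid_to_tid` and the dictionary are made up for illustration; only the thread names and `nid` values come from the dump quoted above.

```python
# nid values copied from the three thread-dump entries quoted above.
# Assumption: HotSpot on Linux, where nid is the native thread id in hex.
NETWORK_THREAD_NIDS = {
    "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-0": "0x581f",
    "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-2": "0x5821",
    "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-1": "0x5820",
}


def nid_to_tid(nid: str) -> int:
    """Convert a hex nid from a thread dump to a decimal thread id."""
    return int(nid, 16)


def decimal_tids(nids: dict) -> dict:
    """Map each thread name to its decimal thread id."""
    return {name: nid_to_tid(nid) for name, nid in nids.items()}


if __name__ == "__main__":
    # Print decimal id first so it is easy to compare with per-thread CPU output.
    for name, tid in sorted(decimal_tids(NETWORK_THREAD_NIDS).items(), key=lambda kv: kv[1]):
        print(f"{tid:>6}  {name}")
```

The decimal ids can then be compared against the per-thread view of the broker process to confirm which threads are actually the hot ones.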