If I run 3 brokers in a cluster on localhost, the CPU usage is virtually zero. I'm not sure why on other environments the minimum usage of each broker is at least 13% (with zero producers/consumers); that doesn't sound normal.
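One way to chase down where that idle CPU goes is to correlate per-thread CPU usage with a thread dump. A minimal sketch, assuming Linux, JDK tools on the PATH, and a broker whose main class is kafka.Kafka (the thread id 12345 below is hypothetical; substitute the hottest thread id from the top output):

```shell
# 1. Find the broker's JVM PID (the broker's main class is kafka.Kafka).
PID=$(pgrep -f kafka.Kafka | head -n 1)

if [ -n "$PID" ]; then
  # 2. Per-thread CPU for that JVM; the PID column holds native thread ids.
  top -b -n 1 -H -p "$PID" | head -n 20

  # 3. jstack reports native thread ids in hex (nid=0x...), so convert.
  TID=12345                   # hypothetical: hottest thread id from the top output
  NID=$(printf '%x' "$TID")   # 12345 -> 3039

  # 4. Take a thread dump and locate that thread's stack.
  jstack "$PID" | grep -A 10 "nid=0x$NID"
fi
```

With the thread name and stack in hand, it should be clearer whether the 13% is network selector spin, replica fetchers, or something else.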
On Thu, Mar 23, 2017 at 4:48 PM, Paul van der Linden <p...@sportr.co.uk> wrote:

> It doesn't seem to be the clients, indeed. Maybe it already uses 13% of CPU
> just maintaining the cluster, with no connections at all except ZooKeeper
> and the other 2 brokers. This is the CPU usage:
>
> CPU SAMPLES BEGIN (total = 86359) Thu Mar 23 16:47:26 2017
> rank   self  accum  count trace method
>    1 87.23% 87.23%  75327 300920 sun.nio.ch.EPollArrayWrapper.epollWait
>    2 12.48% 99.71%  10780 300518 java.net.PlainSocketImpl.socketAccept
>    3  0.06% 99.77%     51 300940 sun.nio.ch.FileDispatcherImpl.write0
>    4  0.02% 99.79%     20 301559 sun.nio.ch.FileDispatcherImpl.read0
>    5  0.01% 99.80%     10 301567 org.apache.log4j.Category.getEffectiveLevel
> CPU SAMPLES END
>
> On Thu, Mar 23, 2017 at 1:02 PM, Jaikiran Pai <jai.forums2...@gmail.com> wrote:
>
>> One thing that you might want to check is the number of consumers that
>> are connected/consuming against this Kafka setup. We have consistently
>> noticed that the CPU usage of the broker is very high even with very few
>> consumers (around 10 Java consumers). There's even a JIRA for it. From
>> what I remember, it had to do with the constant heartbeats and other such
>> network activity that happens between these consumers and the broker. We
>> had this issue from the 0.8.x days until 0.10.0.1. We just migrated to
>> 0.10.2.0 and will have to see whether it is still reproducible there.
>>
>> I don't mean to say you are running into the same issue, but you can
>> check that aspect as well (maybe shut down all consumers and see how the
>> broker CPU behaves).
>>
>> -Jaikiran
>>
>> On Thursday 23 March 2017 06:15 PM, Paul van der Linden wrote:
>>
>>> Thanks. I managed to get a CPU dump from staging.
>>>
>>> The output:
>>>
>>> THREAD START (obj=50000427, id = 200004, name="RMI TCP Accept-0", group="system")
>>> THREAD START (obj=50000427, id = 200001, name="main", group="main")
>>> THREAD START (obj=50000427, id = 200005, name="SensorExpiryThread", group="main")
>>> THREAD START (obj=500008e6, id = 200006, name="ThrottledRequestReaper-Fetch", group="main")
>>> THREAD START (obj=500008e6, id = 200007, name="ThrottledRequestReaper-Produce", group="main")
>>> THREAD START (obj=50000914, id = 200008, name="ZkClient-EventThread-18-zookeeper:2181", group="main")
>>> THREAD START (obj=500008e6, id = 200009, name="main-SendThread()", group="main")
>>> THREAD START (obj=50000950, id = 200010, name="main-EventThread", group="main")
>>> THREAD START (obj=50000427, id = 200011, name="pool-3-thread-1", group="main")
>>> THREAD END (id = 200011)
>>> THREAD START (obj=50000427, id = 200012, name="metrics-meter-tick-thread-1", group="main")
>>> THREAD START (obj=50000427, id = 200014, name="kafka-scheduler-0", group="main")
>>> THREAD START (obj=50000427, id = 200013, name="kafka-scheduler-1", group="main")
>>> THREAD START (obj=50000427, id = 200015, name="kafka-scheduler-2", group="main")
>>> THREAD START (obj=50000c33, id = 200016, name="kafka-log-cleaner-thread-0", group="main")
>>> THREAD START (obj=50000427, id = 200017, name="kafka-network-thread-2-PLAINTEXT-0", group="main")
>>> THREAD START (obj=50000427, id = 200018, name="kafka-network-thread-2-PLAINTEXT-1", group="main")
>>> THREAD START (obj=50000427, id = 200019, name="kafka-network-thread-2-PLAINTEXT-2", group="main")
>>> THREAD START (obj=50000427, id = 200020, name="kafka-socket-acceptor-PLAINTEXT-9092", group="main")
>>> THREAD START (obj=500008e6, id = 200021, name="ExpirationReaper-2", group="main")
>>> THREAD START (obj=500008e6, id = 200022, name="ExpirationReaper-2", group="main")
>>> THREAD START (obj=50000427, id = 200023, name="metrics-meter-tick-thread-2", group="main")
>>> THREAD START (obj=50000427, id = 200024, name="kafka-scheduler-3", group="main")
>>> THREAD START (obj=50000427, id = 200025, name="kafka-scheduler-4", group="main")
>>> THREAD START (obj=50000427, id = 200026, name="kafka-scheduler-5", group="main")
>>> THREAD START (obj=50000427, id = 200027, name="kafka-scheduler-6", group="main")
>>> THREAD START (obj=500008e6, id = 200028, name="ExpirationReaper-2", group="main")
>>> THREAD START (obj=500008e6, id = 200029, name="ExpirationReaper-2", group="main")
>>> THREAD START (obj=500008e6, id = 200030, name="ExpirationReaper-2", group="main")
>>> THREAD START (obj=50000427, id = 200031, name="group-metadata-manager-0", group="main")
>>> THREAD START (obj=50000427, id = 200032, name="kafka-request-handler-0", group="main")
>>> THREAD START (obj=50000427, id = 200037, name="kafka-request-handler-5", group="main")
>>> THREAD START (obj=50000427, id = 200036, name="kafka-request-handler-4", group="main")
>>> THREAD START (obj=50000427, id = 200035, name="kafka-request-handler-3", group="main")
>>> THREAD START (obj=50000427, id = 200034, name="kafka-request-handler-2", group="main")
>>> THREAD START (obj=50000427, id = 200033, name="kafka-request-handler-1", group="main")
>>> THREAD START (obj=50000427, id = 200038, name="kafka-request-handler-6", group="main")
>>> THREAD START (obj=50000427, id = 200039, name="kafka-request-handler-7", group="main")
>>> THREAD START (obj=50000427, id = 200040, name="kafka-scheduler-7", group="main")
>>> THREAD START (obj=50000427, id = 200041, name="kafka-scheduler-8", group="main")
>>> THREAD START (obj=50000ee2, id = 200042, name="ReplicaFetcherThread-0-0", group="main")
>>> THREAD START (obj=50000ee2, id = 200043, name="ReplicaFetcherThread-0-1", group="main")
>>> THREAD START (obj=50000427, id = 200044, name="kafka-scheduler-9", group="main")
>>> THREAD START (obj=50000427, id = 200045, name="executor-Fetch", group="main")
>>>
>>> TRACE 300920:
>>>   sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>>>   sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>>   sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>>   sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>> TRACE 300518:
>>>   java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
>>>   java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
>>>   java.net.ServerSocket.implAccept(ServerSocket.java:545)
>>>   java.net.ServerSocket.accept(ServerSocket.java:513)
>>> TRACE 300940:
>>>   sun.nio.ch.FileDispatcherImpl.write0(FileDispatcherImpl.java:Unknown line)
>>>   sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>   sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>   sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>> TRACE 301003:
>>>   org.xerial.snappy.SnappyNative.rawUncompress(SnappyNative.java:Unknown line)
>>>   org.xerial.snappy.Snappy.rawUncompress(Snappy.java:474)
>>>   org.xerial.snappy.Snappy.uncompress(Snappy.java:513)
>>>   org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:147)
>>> TRACE 300979:
>>>   sun.nio.ch.FileDispatcherImpl.pread0(FileDispatcherImpl.java:Unknown line)
>>>   sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
>>>   sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
>>>   sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>> TRACE 301630:
>>>   sun.nio.ch.EPollArrayWrapper.epollCtl(EPollArrayWrapper.java:Unknown line)
>>>   sun.nio.ch.EPollArrayWrapper.updateRegistrations(EPollArrayWrapper.java:299)
>>>   sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:268)
>>>   sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>> TRACE 301259:
>>>   sun.misc.Unsafe.unpark(Unsafe.java:Unknown line)
>>>   java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:141)
>>>   java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer.java:662)
>>>   java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1264)
>>> TRACE 301559:
>>>   sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java:Unknown line)
>>>   sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>   sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>>   sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>> TRACE 300590:
>>>   java.lang.ClassLoader.defineClass1(ClassLoader.java:Unknown line)
>>>   java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>>   java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>   java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>> TRACE 301643:
>>>   scala.Tuple2.equals(Tuple2.scala:20)
>>>   java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:940)
>>>   kafka.utils.Pool.get(Pool.scala:69)
>>>   kafka.server.ReplicaManager.getPartition(ReplicaManager.scala:280)
>>> TRACE 300592:
>>>   java.util.zip.ZipFile.read(ZipFile.java:Unknown line)
>>>   java.util.zip.ZipFile.access$1400(ZipFile.java:60)
>>>   java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:717)
>>>   java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:419)
>>> TRACE 301018:
>>>   kafka.utils.CoreUtils$.crc32(CoreUtils.scala:148)
>>>   kafka.message.Message.computeChecksum(Message.scala:216)
>>>   kafka.message.Message.isValid(Message.scala:226)
>>>   kafka.message.Message.ensureValid(Message.scala:232)
>>> TRACE 301561:
>>>   java.io.FileDescriptor.sync(FileDescriptor.java:Unknown line)
>>>   kafka.server.OffsetCheckpoint.liftedTree1$1(OffsetCheckpoint.scala:62)
>>>   kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:49)
>>>   kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:945)
>>> TRACE 301422:
>>>   java.util.Arrays.copyOf(Arrays.java:3332)
>>>   java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
>>>   java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
>>>   java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
>>>
>>> CPU SAMPLES BEGIN (total = 699214) Thu Mar 23 12:41:17 2017
>>> rank   self  accum  count trace method
>>>    1 86.46% 86.46% 604544 300920 sun.nio.ch.EPollArrayWrapper.epollWait
>>>    2 12.62% 99.08%  88254 300518 java.net.PlainSocketImpl.socketAccept
>>>    3  0.11% 99.19%    759 300940 sun.nio.ch.FileDispatcherImpl.write0
>>>    4  0.04% 99.23%    253 301003 org.xerial.snappy.SnappyNative.rawUncompress
>>>    5  0.03% 99.26%    231 300979 sun.nio.ch.FileDispatcherImpl.pread0
>>>    6  0.03% 99.29%    220 301630 sun.nio.ch.EPollArrayWrapper.epollCtl
>>>    7  0.03% 99.32%    219 301259 sun.misc.Unsafe.unpark
>>>    8  0.02% 99.34%    145 301559 sun.nio.ch.FileDispatcherImpl.read0
>>>    9  0.01% 99.36%     89 300590 java.lang.ClassLoader.defineClass1
>>>   10  0.01% 99.37%     87 301643 scala.Tuple2.equals
>>>   11  0.01% 99.38%     79 300592 java.util.zip.ZipFile.read
>>>   12  0.01% 99.39%     79 301018 kafka.utils.CoreUtils$.crc32
>>>   13  0.01% 99.40%     78 301561 java.io.FileDescriptor.sync
>>>   14  0.01% 99.41%     72 301422 java.util.Arrays.copyOf
>>> CPU SAMPLES END
>>>
>>> It seems like the disconnects happen far more often than the 10 minute
>>> default. I suspect this has something to do with the double connects,
>>> which I'm not sure how to get around.
>>>
>>> On Thu, Mar 23, 2017 at 11:46 AM, Manikumar <manikumar.re...@gmail.com> wrote:
>>>
>>>> 1. Maybe you can monitor thread-wise CPU usage and correlate it with a
>>>> thread dump to identify the bottleneck.
>>>> 2. The broker config property connections.max.idle.ms is used to close
>>>> idle connections; the default is 10 min.
>>>>
>>>> On Thu, Mar 23, 2017 at 3:55 PM, Paul van der Linden <p...@sportr.co.uk> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I deployed Kafka about a week ago, but there are a few problems with
>>>>> how Kafka behaves. The first is the surprisingly high resource usage.
>>>>> One is the memory (1.5-2 GB for each broker, 3 brokers), although this
>>>>> might be normal. The other is the CPU usage, which starts at a minimum
>>>>> of 20% on each broker, which I find strange given the current
>>>>> throughput (< 1 msg/s).
>>>>>
>>>>> This might have something to do with something else I find strange:
>>>>> Kafka disconnects clients about every 10-20 minutes per broker. It
>>>>> might have something to do with the configuration: deployed in
>>>>> Kubernetes, bootstrapping with a single DNS name (which is backed by
>>>>> all alive Kafka brokers), and then every broker has a separate DNS
>>>>> address which is used after the bootstrap. This means that a client is
>>>>> connected twice to one of the brokers. The reason for the bootstrap
>>>>> DNS name is to make sure I don't have to update all clients to include
>>>>> other brokers.
>>>>>
>>>>> Any advice on how to solve these 2 problems?
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> On Tue, Mar 21, 2017 at 10:30 AM, Paul van der Linden <p...@sportr.co.uk> wrote:
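The per-broker DNS setup described in the thread usually corresponds to broker configuration along these lines; a minimal sketch with hypothetical hostnames (one server.properties per broker, connections.max.idle.ms shown at its default):

```properties
# server.properties for broker 0 (hostnames are hypothetical; one file per broker)
broker.id=0
# Bind on all interfaces inside the pod
listeners=PLAINTEXT://0.0.0.0:9092
# The per-broker DNS name handed back to clients after bootstrap
advertised.listeners=PLAINTEXT://kafka-0.example.internal:9092
# Idle connections are closed after this long (600000 ms = 10 min is the default)
connections.max.idle.ms=600000
```

Note that a client which bootstraps via the shared DNS name and then reconnects via the advertised per-broker name can leave its original bootstrap connection idle; the broker closing that idle connection at connections.max.idle.ms is one plausible source of the periodic disconnects described above.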