If I run 3 brokers in a cluster on localhost, the CPU usage is virtually
zero. I'm not sure why, in other environments, the minimum usage of each
broker is at least 13% (with zero producers/consumers); that doesn't sound
normal.
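
(For comparison across environments, here is a minimal sketch that reads a
broker's process-level CPU over JMX. The host/port are placeholders, and it
assumes the broker JVM was started with remote JMX enabled, e.g. via
JMX_PORT:)

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class BrokerCpu {
        public static void main(String[] args) throws Exception {
            // Placeholder host/port; adjust to wherever the broker runs.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://broker-1:9999/jmxrmi");
            try (JMXConnector jmx = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = jmx.getMBeanServerConnection();
                // ProcessCpuLoad is the JVM's share of total CPU, 0.0-1.0
                // (or -1.0 when the value is not yet available).
                Double load = (Double) conn.getAttribute(
                        new ObjectName("java.lang:type=OperatingSystem"),
                        "ProcessCpuLoad");
                System.out.printf("broker process cpu: %.1f%%%n", load * 100);
            }
        }
    }

Polling this on both localhost and the other environments should make the
13% baseline easy to confirm independently of the container's own metrics.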

On Thu, Mar 23, 2017 at 4:48 PM, Paul van der Linden <p...@sportr.co.uk>
wrote:

> Indeed, it doesn't seem to be the clients. Maybe it already uses 13% of
> CPU just maintaining the cluster, with no connections at all except
> ZooKeeper and the other 2 brokers. This is the CPU usage:
>
> CPU SAMPLES BEGIN (total = 86359) Thu Mar 23 16:47:26 2017
> rank   self  accum   count trace method
>    1 87.23% 87.23%   75327 300920 sun.nio.ch.EPollArrayWrapper.epollWait
>    2 12.48% 99.71%   10780 300518 java.net.PlainSocketImpl.socketAccept
>    3  0.06% 99.77%      51 300940 sun.nio.ch.FileDispatcherImpl.write0
>    4  0.02% 99.79%      20 301559 sun.nio.ch.FileDispatcherImpl.read0
>    5  0.01% 99.80%      10 301567 org.apache.log4j.Category.getEffectiveLevel
> CPU SAMPLES END
>
> On Thu, Mar 23, 2017 at 1:02 PM, Jaikiran Pai <jai.forums2...@gmail.com>
> wrote:
>
>> One thing that you might want to check is the number of consumers that
>> are connected/consuming against this Kafka setup. We have consistently
>> noticed that the CPU usage of the broker is very high even with very few
>> consumers (around 10 Java consumers). There's even a JIRA for it. From
>> what I remember, it had to do with the constant heartbeats and other such
>> network activity between these consumers and the broker. We had this
>> issue from the 0.8.x days up to 0.10.0.1. We just migrated to 0.10.2.0
>> and will have to see if it is still reproducible there.
>>
>> I don't mean to say you are running into the same issue, but you can
>> check that aspect as well (maybe shut down all consumers and see how the
>> broker CPU behaves).
>>
>> -Jaikiran
>>
>>
>> On Thursday 23 March 2017 06:15 PM, Paul van der Linden wrote:
>>
>>> Thanks. I managed to get a CPU dump from staging.
>>>
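>>> (Aside: this is HPROF CPU-sampling output. A dump like this can be
>>> captured by restarting the broker JVM with an option along these lines;
>>> the interval and depth values below are illustrative, not necessarily
>>> the ones used here:
>>>
>>>   -agentlib:hprof=cpu=samples,interval=20,depth=4
>>>
>>> and then stopping the JVM, which by default writes java.hprof.txt.)
>>>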
>>> The output:
>>> THREAD START (obj=50000427, id = 200004, name="RMI TCP Accept-0",
>>> group="system")
>>> THREAD START (obj=50000427, id = 200001, name="main", group="main")
>>> THREAD START (obj=50000427, id = 200005, name="SensorExpiryThread",
>>> group="main")
>>> THREAD START (obj=500008e6, id = 200006,
>>> name="ThrottledRequestReaper-Fetch", group="main")
>>> THREAD START (obj=500008e6, id = 200007,
>>> name="ThrottledRequestReaper-Produce", group="main")
>>> THREAD START (obj=50000914, id = 200008,
>>> name="ZkClient-EventThread-18-zookeeper:2181", group="main")
>>> THREAD START (obj=500008e6, id = 200009, name="main-SendThread()",
>>> group="main")
>>> THREAD START (obj=50000950, id = 200010, name="main-EventThread",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200011, name="pool-3-thread-1",
>>> group="main")
>>> THREAD END (id = 200011)
>>> THREAD START (obj=50000427, id = 200012,
>>> name="metrics-meter-tick-thread-1", group="main")
>>> THREAD START (obj=50000427, id = 200014, name="kafka-scheduler-0",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200013, name="kafka-scheduler-1",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200015, name="kafka-scheduler-2",
>>> group="main")
>>> THREAD START (obj=50000c33, id = 200016, name="kafka-log-cleaner-thread-0",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200017,
>>> name="kafka-network-thread-2-PLAINTEXT-0", group="main")
>>> THREAD START (obj=50000427, id = 200018,
>>> name="kafka-network-thread-2-PLAINTEXT-1", group="main")
>>> THREAD START (obj=50000427, id = 200019,
>>> name="kafka-network-thread-2-PLAINTEXT-2", group="main")
>>> THREAD START (obj=50000427, id = 200020,
>>> name="kafka-socket-acceptor-PLAINTEXT-9092", group="main")
>>> THREAD START (obj=500008e6, id = 200021, name="ExpirationReaper-2",
>>> group="main")
>>> THREAD START (obj=500008e6, id = 200022, name="ExpirationReaper-2",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200023,
>>> name="metrics-meter-tick-thread-2", group="main")
>>> THREAD START (obj=50000427, id = 200024, name="kafka-scheduler-3",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200025, name="kafka-scheduler-4",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200026, name="kafka-scheduler-5",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200027, name="kafka-scheduler-6",
>>> group="main")
>>> THREAD START (obj=500008e6, id = 200028, name="ExpirationReaper-2",
>>> group="main")
>>> THREAD START (obj=500008e6, id = 200029, name="ExpirationReaper-2",
>>> group="main")
>>> THREAD START (obj=500008e6, id = 200030, name="ExpirationReaper-2",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200031, name="group-metadata-manager-0",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200032, name="kafka-request-handler-0",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200037, name="kafka-request-handler-5",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200036, name="kafka-request-handler-4",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200035, name="kafka-request-handler-3",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200034, name="kafka-request-handler-2",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200033, name="kafka-request-handler-1",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200038, name="kafka-request-handler-6",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200039, name="kafka-request-handler-7",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200040, name="kafka-scheduler-7",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200041, name="kafka-scheduler-8",
>>> group="main")
>>> THREAD START (obj=50000ee2, id = 200042, name="ReplicaFetcherThread-0-0",
>>> group="main")
>>> THREAD START (obj=50000ee2, id = 200043, name="ReplicaFetcherThread-0-1",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200044, name="kafka-scheduler-9",
>>> group="main")
>>> THREAD START (obj=50000427, id = 200045, name="executor-Fetch",
>>> group="main")
>>> TRACE 300920:
>>> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>>> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>> TRACE 300518:
>>> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
>>> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
>>> java.net.ServerSocket.implAccept(ServerSocket.java:545)
>>> java.net.ServerSocket.accept(ServerSocket.java:513)
>>> TRACE 300940:
>>> sun.nio.ch.FileDispatcherImpl.write0(FileDispatcherImpl.java:Unknown line)
>>> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>> sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>> sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>> TRACE 301003:
>>> org.xerial.snappy.SnappyNative.rawUncompress(SnappyNative.java:Unknown line)
>>> org.xerial.snappy.Snappy.rawUncompress(Snappy.java:474)
>>> org.xerial.snappy.Snappy.uncompress(Snappy.java:513)
>>> org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:147)
>>> TRACE 300979:
>>> sun.nio.ch.FileDispatcherImpl.pread0(FileDispatcherImpl.java:Unknown line)
>>> sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
>>> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
>>> sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>> TRACE 301630:
>>> sun.nio.ch.EPollArrayWrapper.epollCtl(EPollArrayWrapper.java:Unknown line)
>>> sun.nio.ch.EPollArrayWrapper.updateRegistrations(EPollArrayWrapper.java:299)
>>> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:268)
>>> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>> TRACE 301259:
>>> sun.misc.Unsafe.unpark(Unsafe.java:Unknown line)
>>> java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:141)
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer.java:662)
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1264)
>>> TRACE 301559:
>>> sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java:Unknown line)
>>> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>> sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>> TRACE 300590:
>>> java.lang.ClassLoader.defineClass1(ClassLoader.java:Unknown line)
>>> java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>> java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>> TRACE 301643:
>>> scala.Tuple2.equals(Tuple2.scala:20)
>>> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:940)
>>> kafka.utils.Pool.get(Pool.scala:69)
>>> kafka.server.ReplicaManager.getPartition(ReplicaManager.scala:280)
>>> TRACE 300592:
>>> java.util.zip.ZipFile.read(ZipFile.java:Unknown line)
>>> java.util.zip.ZipFile.access$1400(ZipFile.java:60)
>>> java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:717)
>>> java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:419)
>>> TRACE 301018:
>>> kafka.utils.CoreUtils$.crc32(CoreUtils.scala:148)
>>> kafka.message.Message.computeChecksum(Message.scala:216)
>>> kafka.message.Message.isValid(Message.scala:226)
>>> kafka.message.Message.ensureValid(Message.scala:232)
>>> TRACE 301561:
>>> java.io.FileDescriptor.sync(FileDescriptor.java:Unknown line)
>>> kafka.server.OffsetCheckpoint.liftedTree1$1(OffsetCheckpoint.scala:62)
>>> kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:49)
>>> kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:945)
>>> TRACE 301422:
>>> java.util.Arrays.copyOf(Arrays.java:3332)
>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
>>> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
>>> CPU SAMPLES BEGIN (total = 699214) Thu Mar 23 12:41:17 2017
>>> rank   self  accum   count trace method
>>>    1 86.46% 86.46%  604544 300920 sun.nio.ch.EPollArrayWrapper.epollWait
>>>    2 12.62% 99.08%   88254 300518 java.net.PlainSocketImpl.socketAccept
>>>    3  0.11% 99.19%     759 300940 sun.nio.ch.FileDispatcherImpl.write0
>>>    4  0.04% 99.23%     253 301003 org.xerial.snappy.SnappyNative.rawUncompress
>>>    5  0.03% 99.26%     231 300979 sun.nio.ch.FileDispatcherImpl.pread0
>>>    6  0.03% 99.29%     220 301630 sun.nio.ch.EPollArrayWrapper.epollCtl
>>>    7  0.03% 99.32%     219 301259 sun.misc.Unsafe.unpark
>>>    8  0.02% 99.34%     145 301559 sun.nio.ch.FileDispatcherImpl.read0
>>>    9  0.01% 99.36%      89 300590 java.lang.ClassLoader.defineClass1
>>>   10  0.01% 99.37%      87 301643 scala.Tuple2.equals
>>>   11  0.01% 99.38%      79 300592 java.util.zip.ZipFile.read
>>>   12  0.01% 99.39%      79 301018 kafka.utils.CoreUtils$.crc32
>>>   13  0.01% 99.40%      78 301561 java.io.FileDescriptor.sync
>>>   14  0.01% 99.41%      72 301422 java.util.Arrays.copyOf
>>> CPU SAMPLES END
>>>
>>> The constant disconnects seem to happen far more often than the
>>> 10-minute default would explain. I suspect this has something to do with
>>> the double connections, which I'm not sure how to get around.
>>>
>>> On Thu, Mar 23, 2017 at 11:46 AM, Manikumar <manikumar.re...@gmail.com>
>>> wrote:
>>>
>>>> 1. Maybe you can monitor thread-wise CPU usage and correlate it with a
>>>>    thread dump to identify the bottleneck (see the sketch below).
>>>> 2. The broker config property connections.max.idle.ms is used to close
>>>>    idle connections; the default is 10 minutes.
>>>>
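>>>> A minimal sketch of point 1, assuming you can run it inside (or attach
>>>> it to) the JVM being measured -- the class name and output format are
>>>> just illustrative:
>>>>
>>>>   import java.lang.management.ManagementFactory;
>>>>   import java.lang.management.ThreadInfo;
>>>>   import java.lang.management.ThreadMXBean;
>>>>
>>>>   public class ThreadCpu {
>>>>       public static void main(String[] args) {
>>>>           ThreadMXBean bean = ManagementFactory.getThreadMXBean();
>>>>           for (long id : bean.getAllThreadIds()) {
>>>>               ThreadInfo info = bean.getThreadInfo(id);
>>>>               if (info == null) continue;   // thread already exited
>>>>               // getThreadCpuTime returns nanoseconds, -1 if disabled.
>>>>               long cpuMs = bean.getThreadCpuTime(id) / 1_000_000;
>>>>               System.out.printf("%-45s %8d ms%n",
>>>>                       info.getThreadName(), cpuMs);
>>>>           }
>>>>       }
>>>>   }
>>>>
>>>> Diffing two runs taken a few seconds apart shows which threads are
>>>> actually burning CPU in between.
>>>>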
>>>> On Thu, Mar 23, 2017 at 3:55 PM, Paul van der Linden <p...@sportr.co.uk>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I deployed Kafka about a week ago, but there are a few problems with
>>>>> how Kafka behaves. The first is the surprisingly high resource usage:
>>>>> one part is the memory (1.5-2 GB for each broker, 3 brokers), although
>>>>> this might be normal. The other is the CPU usage, which starts at a
>>>>> minimum of 20% on each broker, which I find strange given the current
>>>>> throughput (< 1 msg/s).
>>>>>
>>>>> This might have something to do with something else I find strange:
>>>>> Kafka disconnects clients about every 10-20 minutes per broker. It
>>>>> might have something to do with the configuration: deployed in
>>>>> Kubernetes, bootstrapping with a single DNS name (which is backed by
>>>>> all live Kafka brokers), and then every broker has a separate DNS
>>>>> address which is used after the bootstrap. This means that a client is
>>>>> connected twice to one of the brokers. The reason for the bootstrap DNS
>>>>> name is to make sure I don't have to update all clients to include
>>>>> other brokers.
>>>>>
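>>>>> For reference, the client side of that setup looks roughly like the
>>>>> sketch below. The DNS name and topic are placeholders; after the
>>>>> initial metadata fetch the client reconnects to each broker's own
>>>>> advertised name, which is where the double connection comes from.
>>>>>
>>>>>   import java.util.Collections;
>>>>>   import java.util.Properties;
>>>>>   import org.apache.kafka.clients.consumer.KafkaConsumer;
>>>>>
>>>>>   public class ExampleConsumer {
>>>>>       public static void main(String[] args) {
>>>>>           Properties props = new Properties();
>>>>>           // One DNS name instead of listing every broker explicitly.
>>>>>           props.put("bootstrap.servers", "kafka-bootstrap.example.svc:9092");
>>>>>           props.put("group.id", "example-group");
>>>>>           props.put("key.deserializer",
>>>>>                   "org.apache.kafka.common.serialization.StringDeserializer");
>>>>>           props.put("value.deserializer",
>>>>>                   "org.apache.kafka.common.serialization.StringDeserializer");
>>>>>           KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
>>>>>           consumer.subscribe(Collections.singletonList("example-topic"));
>>>>>       }
>>>>>   }
>>>>>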
>>>>> Any advice on how to solve these 2 problems?
>>>>>
>>>>> Thanks,
>>>>> Paul
