[ https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350909#comment-15350909 ]
Andrey Konyaev commented on KAFKA-3900:
---------------------------------------

Any comments?

> High CPU util on broker
> -----------------------
>
>                 Key: KAFKA-3900
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3900
>             Project: Kafka
>          Issue Type: Bug
>         Environment: kafka = 2.11-0.10.0.0
>                      java version "1.8.0_91"
>                      amazon linux
>            Reporter: Andrey Konyaev
>
> I started a Kafka cluster in Amazon on m4.xlarge instances (4 CPUs and 16 GB of memory, 14 GB allocated to the Kafka heap). The cluster has three nodes.
> The load is not high (6000 messages/sec) and cpu_idle = 70%, but sometimes (about once a day) I see this message in server.log:
>
> [2016-06-24 14:52:22,299] WARN [ReplicaFetcherThread-0-2], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6eaa1034 (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 2 was disconnected before the response was read
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
>         at scala.Option.foreach(Option.scala:257)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
>         at kafka.utils.NetworkClientBlockingOps$.recursivePoll$2(NetworkClientBlockingOps.scala:137)
>         at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
>         at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
>         at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:244)
>         at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:229)
>         at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:107)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:98)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
>
> I know this can be a network glitch, but why does Kafka eat all the CPU time?
>
> My config:
>
> inter.broker.protocol.version=0.10.0.0
> log.message.format.version=0.10.0.0
> default.replication.factor=3
> num.partitions=3
> replica.lag.time.max.ms=15000
> broker.id=0
> listeners=PLAINTEXT://:9092
> log.dirs=/mnt/kafka/kafka
> log.retention.check.interval.ms=300000
> log.retention.hours=168
> log.segment.bytes=1073741824
> num.io.threads=20
> num.network.threads=10
> num.partitions=1
> num.recovery.threads.per.data.dir=2
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> socket.send.buffer.bytes=102400
> zookeeper.connection.timeout.ms=6000
> delete.topic.enable = true
> broker.max_heap_size=10 GiB
>
> Any ideas?
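A note worth adding here: when the CPU spikes, correlating the busy native thread IDs from "top -H -p <broker pid>" with a jstack thread dump would confirm whether it really is the ReplicaFetcherThread spinning. If it is, the usual culprit is a fetch loop that retries a failing connection with little or no backoff. The Scala sketch below is an illustration only, not Kafka's actual ReplicaFetcherThread code; sendFetch and the backoff values are invented for the example. It shows why an immediate-retry loop pegs a core and how a bounded backoff keeps the retry cheap until the peer broker is reachable again.

    import java.io.IOException

    object FetchRetrySketch {

      // Stand-in for a blocking replica fetch that keeps failing while the
      // peer broker (broker 2 in the log above) is unreachable.
      def sendFetch(): Unit =
        throw new IOException("Connection to 2 was disconnected before the response was read")

      // Retry loop with a bounded exponential backoff. Without the sleep the
      // loop retries the failing fetch immediately and the fetcher thread
      // burns a full CPU core doing nothing useful.
      def fetchLoop(maxBackoffMs: Long = 1000L): Unit = {
        var backoffMs = 1L
        while (true) {
          try {
            sendFetch()
            backoffMs = 1L // reset after a successful fetch
          } catch {
            case _: IOException =>
              Thread.sleep(backoffMs)
              backoffMs = math.min(backoffMs * 2, maxBackoffMs)
          }
        }
      }

      def main(args: Array[String]): Unit = fetchLoop()
    }

Since the config above does not set replica.fetch.backoff.ms, it may also be worth checking whether that setting (the sleep applied by the replica fetcher on fetch errors) is at its default or has been lowered elsewhere; this is a suggestion to investigate, not a confirmed root cause.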