Andrey Konyaev created KAFKA-3900:
-------------------------------------

             Summary: High CPU util on broker
                 Key: KAFKA-3900
                 URL: https://issues.apache.org/jira/browse/KAFKA-3900
             Project: Kafka
          Issue Type: Bug
         Environment: kafka = 2.11-0.10.0.0
java version "1.8.0_91"
amazon linux
            Reporter: Andrey Konyaev
I run a Kafka cluster in Amazon on m4.xlarge instances (4 CPUs and 16 GB of memory, 14 GB of which is allocated to the Kafka heap). The cluster has three nodes. The load is not high (6000 messages/sec) and cpu_idle is around 70%, but sometimes (about once a day) I see this message in server.log:

[2016-06-24 14:52:22,299] WARN [ReplicaFetcherThread-0-2], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6eaa1034 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 2 was disconnected before the response was read
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
        at scala.Option.foreach(Option.scala:257)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
        at kafka.utils.NetworkClientBlockingOps$.recursivePoll$2(NetworkClientBlockingOps.scala:137)
        at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
        at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
        at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:244)
        at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:229)
        at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:107)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:98)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)

I know this can be a network glitch, but why does Kafka eat all the CPU time?

My config:

inter.broker.protocol.version=0.10.0.0
log.message.format.version=0.10.0.0
default.replication.factor=3
num.partitions=3
replica.lag.time.max.ms=15000
broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/mnt/kafka/kafka
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
num.io.threads=20
num.network.threads=10
num.partitions=1
num.recovery.threads.per.data.dir=2
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
zookeeper.connection.timeout.ms=6000
delete.topic.enable = true
broker.max_heap_size=10 GiB

Any ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
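Not part of the original report, but a minimal diagnostic sketch for the question of which broker thread is actually consuming the CPU: it connects to the broker over JMX (assuming the broker JVM was started with com.sun.management.jmxremote enabled; the host/port below are hypothetical) and prints accumulated per-thread CPU time via ThreadMXBean. A thread such as ReplicaFetcherThread-0-2 whose CPU time grows much faster than wall-clock time between two runs would support the suspicion that the fetcher's reconnect loop is the culprit.

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerThreadCpu {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint; point it at the broker's JMX port.
        String url = "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi";
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            // Print accumulated CPU time per live thread; -1 means CPU timing
            // is unsupported or the thread has already exited.
            for (long id : threads.getAllThreadIds()) {
                ThreadInfo info = threads.getThreadInfo(id);
                long cpuNanos = threads.getThreadCpuTime(id);
                if (info != null && cpuNanos > 0) {
                    System.out.printf("%-60s %10d ms%n",
                            info.getThreadName(), cpuNanos / 1_000_000);
                }
            }
        }
    }
}
{code}

Running this twice a minute apart and diffing the numbers shows which threads are busy during the spike. The Kafka launch scripts normally honour a JMX_PORT environment variable, so exposing JMX on a broker usually does not require editing the startup command itself.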