Pedro Gontijo created KAFKA-7757:
------------------------------------

             Summary: Too many open files after java.io.IOException: Connection to n was disconnected before the response was read
                 Key: KAFKA-7757
                 URL: https://issues.apache.org/jira/browse/KAFKA-7757
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.1.0
            Reporter: Pedro Gontijo

We upgraded a cluster of 3 brokers from 0.10.2.2 to 2.1.0.

After a while (hours), 2 brokers start to throw:
{code:java}
java.io.IOException: Connection to NN was disconnected before the response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190)
    at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241)
    at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130)
    at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129)
    at scala.Option.foreach(Option.scala:257)
    at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
{code}
The problem has happened with all brokers.

File descriptors start to pile up, and if I do not restart the broker it throws "Too many open files" and crashes.
{code:java}
ERROR Error while accepting connection (kafka.network.Acceptor)
java.io.IOException: Too many open files in system
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    at kafka.network.Acceptor.accept(SocketServer.scala:460)
    at kafka.network.Acceptor.run(SocketServer.scala:403)
    at java.lang.Thread.run(Thread.java:748)
{code}
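
For anyone trying to watch the descriptor growth described above, here is a minimal sketch of reading the open and maximum file descriptor counts from a JVM. It is an illustration, not something taken from the broker code: the class name `FdUsage` and the method `fdUsage` are made up for this example, and it assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the platform bean implements `com.sun.management.UnixOperatingSystemMXBean`. As written it reports on the JVM it runs in, not on a remote broker.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for the current JVM,
     * or null when the platform bean does not expose them
     * (for example on Windows or on a non-HotSpot JVM).
     */
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        if (usage == null) {
            System.out.println("File descriptor counts are not available on this JVM/platform");
            return;
        }
        System.out.printf("open=%d max=%d%n", usage[0], usage[1]);
    }
}
```

On such JVMs the same two values should also be readable as the `OpenFileDescriptorCount` and `MaxFileDescriptorCount` attributes of the `java.lang:type=OperatingSystem` MBean, so a broker started with JMX enabled can be sampled over time without attaching extra code.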