Does your test program run as the same user as the Kafka broker?

Thanks,

Jun

On Tue, Jul 8, 2014 at 1:42 PM, Lung, Paul <pl...@ebay.com> wrote:

> Hi Guys,
>
> I’m seeing the following errors from the 0.8.1.1 broker. This occurs most
> often on the Controller machine. Then the controller process crashes, and
> the controller bounces to other machines, which causes those machines to
> crash. Looking at the file descriptors being held by the process, it’s only
> around 4000 or so (looking at . There aren’t a whole lot of connections in
> TIME_WAIT states, and I’ve increased the ephemeral port range to “16000 –
> 64000” via “/proc/sys/net/ipv4/ip_local_port_range”. I’ve written a Java
> test program to see how many sockets and files I can open. The socket is
> definitely limited by the ephemeral port range, which was around 22K at the
> time. But I can open tons of files, since the open file limit of the user
> is set to 100K.
>
> So given that I can theoretically open 48K sockets and probably 90K files,
> and I only see around 4K total for the Kafka broker, I’m really confused as
> to why I’m seeing this error. Is there some internal Kafka limit that I
> don’t know about?
>
> Paul Lung
>
> java.io.IOException: Too many open files
>     at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>     at kafka.network.Acceptor.accept(SocketServer.scala:200)
>     at kafka.network.Acceptor.run(SocketServer.scala:154)
>     at java.lang.Thread.run(Thread.java:679)
> [2014-07-08 13:07:21,534] ERROR Error in acceptor (kafka.network.Acceptor)
> java.io.IOException: Too many open files
>     at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>     at kafka.network.Acceptor.accept(SocketServer.scala:200)
>     at kafka.network.Acceptor.run(SocketServer.scala:154)
>     at java.lang.Thread.run(Thread.java:679)
> [2014-07-08 13:07:21,563] ERROR [ReplicaFetcherThread-3-2124488], Error for partition [bom__021____active_80__32__mini____activeitem_lvs_qn,0] to broker 2124488:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> [2014-07-08 13:07:21,558] FATAL [Replica Manager on Broker 2140112]: Error writing to highwatermark file: (kafka.server.ReplicaManager)
> java.io.FileNotFoundException: /ebay/cronus/software/cronusapp_home/kafka/kafka-logs/replication-offset-checkpoint.tmp (Too many open files)
>     at java.io.FileOutputStream.open(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
>     at java.io.FileWriter.<init>(FileWriter.java:90)
>     at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
>     at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:447)
>     at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:444)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>     at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:444)
>     at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:94)
>     at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:679)
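
One way to compare the descriptor count discussed above with the limit that actually applies to the running broker is to read the process's own entries under /proc instead of the user's shell limits. On Linux the open-file limit is fixed per process when the process starts, so it can differ from what a separately launched test program sees. The following is only a rough sketch, assuming Linux and Java 8 or later; the class name `FdUsage` and its helper methods are made up for illustration and are not part of Kafka.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper (not part of Kafka): compares a process's open
// descriptor count with the soft limit the kernel enforces for it.
final class FdUsage {

    // Parses the soft "Max open files" value from the lines of /proc/<pid>/limits.
    // Returns Long.MAX_VALUE for "unlimited" and -1 if the row is not present.
    static long parseMaxOpenFiles(List<String> limitsLines) {
        String prefix = "Max open files";
        for (String line : limitsLines) {
            if (line.startsWith(prefix)) {
                // Remaining columns are: soft limit, hard limit, units.
                String[] cols = line.substring(prefix.length()).trim().split("\\s+");
                return "unlimited".equals(cols[0]) ? Long.MAX_VALUE : Long.parseLong(cols[0]);
            }
        }
        return -1;
    }

    // Counts the entries of /proc/<pid>/fd; each entry is one open descriptor
    // (files and sockets alike). When inspecting the current process, the
    // directory listing itself briefly adds one descriptor to the count.
    static long countOpenFds(Path fdDir) throws IOException {
        try (Stream<Path> entries = Files.list(fdDir)) {
            return entries.count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the broker's pid as the first argument; defaults to this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        Path proc = Paths.get("/proc", pid);
        long softLimit = parseMaxOpenFiles(Files.readAllLines(proc.resolve("limits")));
        long open = countOpenFds(proc.resolve("fd"));
        System.out.println("pid=" + pid + " open=" + open + " softLimit=" + softLimit);
    }
}
```

Run it as `java FdUsage <broker-pid>`, either as the user that owns the broker process or as root, since /proc/<pid>/fd is not readable by other users. If the reported soft limit is far below 100K, the broker was started with a different limit than the one the test program ran under.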