Hi Prakash,

How many open files do you expect a broker to be able to handle? It seems like this broker is crashing at around 4100 or so open files.

Thanks,
Paul Lung

On 6/24/14, 11:08 PM, "Lung, Paul" <pl...@ebay.com> wrote:

>Ok. What I just saw was that when the controller machine reaches around
>4100+ files, it crashes. Then I think the controller bounced between 2
>other machines, taking them down too, and then circled back to the
>original machine.
>
>Paul Lung
>
>On 6/24/14, 10:51 PM, "Lung, Paul" <pl...@ebay.com> wrote:
>
>>The controller machine has 3500 or so, while the other machines have
>>around 1600.
>>
>>Paul Lung
>>
>>On 6/24/14, 10:31 PM, "Prakash Gowri Shankor" <prakash.shan...@gmail.com>
>>wrote:
>>
>>>How many files does each broker itself have open? You can find this
>>>from
>>>'ls -l /proc/<processid>/fd'
>>>
>>>On Tue, Jun 24, 2014 at 10:18 PM, Lung, Paul <pl...@ebay.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I just upgraded my cluster from 0.8.1 to 0.8.1.1. I'm seeing the
>>>> following error messages on the same 3 brokers once in a while:
>>>>
>>>> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
>>>> java.io.IOException: Too many open files
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>>>     at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>>>     at kafka.network.Acceptor.run(SocketServer.scala:154)
>>>>     at java.lang.Thread.run(Thread.java:679)
>>>>
>>>> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
>>>> java.io.IOException: Too many open files
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>>>     at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>>>     at kafka.network.Acceptor.run(SocketServer.scala:154)
>>>>     at java.lang.Thread.run(Thread.java:679)
>>>>
>>>> When this happens, these 3 brokers essentially go out of sync when
>>>> you do a "kafka-topics.sh --describe".
>>>>
>>>> I tracked the number of open files by doing
>>>> "watch -n 1 'sudo lsof | wc -l'", which basically counts all open
>>>> files on the system. The numbers for the systems are basically in the
>>>> 6000 range, with one system going to 9000. I presume the 9000 machine
>>>> is the controller. Looking at the ulimit of the user, both the hard
>>>> limit and the soft limit for open files is 100,000. Using sysctl, the
>>>> max file is fs.file-max = 9774928. So we seem to be way under the
>>>> limit.
>>>>
>>>> What am I missing here? Is there some JVM limit around 10K open files
>>>> or something?
>>>>
>>>> Paul Lung
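
The thread compares a system-wide lsof count and the user's ulimit against a per-process count from /proc/<processid>/fd. A minimal sketch of checking both numbers for one process follows, assuming a Linux host where /proc/<pid>/limits reports the limits actually applied to that running process (which can differ from what ulimit shows in a login shell). The function names and the proc_root parameter are hypothetical helpers for illustration, not part of Kafka or any tool mentioned above, and this is not a diagnosis of the crash described in the thread.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def _to_int(value):
    # The kernel prints "unlimited" when no numeric limit applies.
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return _to_int(soft), _to_int(hard)
    return None


def read_max_open_files(pid, proc_root="/proc"):
    """Read the open-file limits applied to the running process itself."""
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        return parse_max_open_files(f.read())


def report(pid, proc_root="/proc"):
    """Summarize descriptor usage against the process's own soft limit."""
    used = count_open_fds(pid, proc_root)
    limits = read_max_open_files(pid, proc_root)
    soft = limits[0] if limits else None
    return "pid %s: %d open fds, soft limit %s" % (
        pid, used, "unlimited/unknown" if soft is None else soft)
```

Calling report(<broker pid>) as root (or as the user that owns the broker process) prints the descriptor count next to the soft limit in effect for that process, which makes it easy to see whether the broker is actually running with the 100,000 limit reported by ulimit.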