Hi Prakash,

How many open files do you expect a broker to be able to handle? It seems
like this broker is crashing at around 4100 or so open files.
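
To compare a broker's open-file count with the limit its process is actually running under, here is a minimal shell sketch. It assumes Linux with /proc mounted, and the function names count_open_fds and soft_nofile_limit are illustrative, not part of Kafka. The limit in effect for a running process can differ from what ulimit reports in an interactive shell, so the sketch reads /proc/<pid>/limits instead.

```shell
#!/bin/sh
# Sketch only: assumes Linux with /proc mounted. Function names are illustrative.

# Count the file descriptors currently open in process $1.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Print the soft "Max open files" limit in effect for process $1.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# With no argument, inspect this shell itself; otherwise pass the broker's PID.
pid="${1:-$$}"
echo "pid $pid: $(count_open_fds "$pid") open, soft limit $(soft_nofile_limit "$pid")"
```

Run it as the broker's user or with sudo, passing the broker's PID as the first argument. Reading another user's /proc/<pid>/fd normally requires elevated privileges.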

Thanks,
Paul Lung

On 6/24/14, 11:08 PM, "Lung, Paul" <pl...@ebay.com> wrote:

>Ok. What I just saw was that when the controller machine reaches around
>4100+ files, it crashes. Then I think the controller bounced between 2
>other machines, taking them down too, and then circled back to the original
>machine.
>
>Paul Lung
>
>On 6/24/14, 10:51 PM, "Lung, Paul" <pl...@ebay.com> wrote:
>
>>The controller machine has 3500 or so, while the other machines have
>>around 1600.
>>
>>Paul Lung
>>
>>On 6/24/14, 10:31 PM, "Prakash Gowri Shankor" <prakash.shan...@gmail.com>
>>wrote:
>>
>>>How many files does each broker itself have open? You can find this from
>>>'ls -l /proc/<processid>/fd'
>>>
>>>
>>>
>>>
>>>On Tue, Jun 24, 2014 at 10:18 PM, Lung, Paul <pl...@ebay.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>>
>>>> I just upgraded my cluster from 0.8.1 to 0.8.1.1. I'm seeing the following
>>>> error messages on the same 3 brokers once in a while:
>>>>
>>>>
>>>> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
>>>> java.io.IOException: Too many open files
>>>>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>>>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>>>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>>>>         at java.lang.Thread.run(Thread.java:679)
>>>>
>>>> [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor)
>>>> java.io.IOException: Too many open files
>>>>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>>>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>>>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>>>>         at java.lang.Thread.run(Thread.java:679)
>>>>
>>>> When this happens, these 3 brokers essentially go out of sync when you do
>>>> a "kafka-topics.sh --describe".
>>>>
>>>> I tracked the number of open files by doing "watch -n 1 'sudo lsof | wc -l'",
>>>> which basically counts all open files on the system. The numbers for the
>>>> systems are basically in the 6000 range, with one system going to 9000.
>>>> I presume the 9000 machine is the controller. Looking at the ulimit of the
>>>> user, both the hard limit and the soft limit for open files are 100,000.
>>>> Using sysctl, the max file is fs.file-max = 9774928. So we seem to be way
>>>> under the limit.
>>>>
>>>> What am I missing here? Is there some JVM limit around 10K open files or
>>>> something?
>>>>
>>>> Paul Lung
>>>>
>>
>
