What OS settings did you change? How high is your huge file limit?
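(Editor's note: the limits and socket timeouts discussed in this thread can be inspected as below. This is a diagnostic sketch for Linux, not part of the original exchange; the sysctl names are standard, the broker PID is a placeholder, and the defaults in the comments are typical Linux values.)

```shell
# Current per-process open-file limit for this shell:
ulimit -n

# For a running broker, check the limit its process actually got
# (placeholder PID shown, substitute the real one):
# grep 'open files' /proc/<broker-pid>/limits

# Kernel TCP keepalive settings that control how long a dead peer's
# connection can linger in ESTABLISHED (Linux defaults in comments):
sysctl net.ipv4.tcp_keepalive_time    # default: 7200 s before first probe
sysctl net.ipv4.tcp_keepalive_intvl   # default: 75 s between probes
sysctl net.ipv4.tcp_keepalive_probes  # default: 9 failed probes to drop
```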
On Sep 25, 2013, at 10:06 PM, Nicolas Berthet <nicolasbert...@maaii.com> wrote:

> Jun,
>
> I observed a similar kind of thing recently. (We didn't notice it before
> because our file limit is huge.)
>
> I have a set of brokers in a datacenter, and producers in different data
> centers.
>
> At some point I got disconnections. From the producer's perspective I had
> something like 15 connections to the broker. On the broker side, however,
> I observed hundreds of connections from the producer in an ESTABLISHED
> state.
>
> We had some default settings for the socket timeout at the OS level, which
> we reduced hoping it would prevent the issue in the future. I'm not sure
> whether the issue is in the broker or in the OS configuration, though. I'm
> keeping the broker under observation for the time being.
>
> Note that for clients in the same datacenter we didn't see this issue; the
> socket count matches on both ends.
>
> Nicolas Berthet
>
> -----Original Message-----
> From: Jun Rao [mailto:jun...@gmail.com]
> Sent: Thursday, September 26, 2013 12:39 PM
> To: users@kafka.apache.org
> Subject: Re: Too many open files
>
> If a client is gone, the broker should automatically close those broken
> sockets. Are you using a hardware load balancer?
>
> Thanks,
>
> Jun
>
>
> On Wed, Sep 25, 2013 at 4:48 PM, Mark <static.void....@gmail.com> wrote:
>
>> FYI, if I kill all producers I don't see the number of open files drop.
>> I still see all the ESTABLISHED connections.
>>
>> Is there a broker setting to automatically kill any inactive TCP
>> connections?
>>
>>
>> On Sep 25, 2013, at 4:30 PM, Mark <static.void....@gmail.com> wrote:
>>
>>> Any other ideas?
>>>
>>> On Sep 25, 2013, at 9:06 AM, Jun Rao <jun...@gmail.com> wrote:
>>>
>>>> We haven't seen any socket leaks with the java producer.
>>>> If you have lots of unexplained socket connections in established
>>>> mode, one possible cause is that the client created new producer
>>>> instances but didn't close the old ones.
>>>>
>>>> Thanks,
>>>>
>>>> Jun
>>>>
>>>>
>>>> On Wed, Sep 25, 2013 at 6:08 AM, Mark <static.void....@gmail.com> wrote:
>>>>
>>>>> No. We are using the kafka-rb Ruby gem producer:
>>>>> https://github.com/acrosa/kafka-rb
>>>>>
>>>>> Now that you've asked that, I have to ask: is there a problem with
>>>>> the java producer?
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>> On Sep 24, 2013, at 9:01 PM, Jun Rao <jun...@gmail.com> wrote:
>>>>>>
>>>>>> Are you using the java producer client?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jun
>>>>>>
>>>>>>
>>>>>>> On Tue, Sep 24, 2013 at 5:33 PM, Mark <static.void....@gmail.com> wrote:
>>>>>>>
>>>>>>> Our 0.7.2 Kafka cluster keeps crashing with:
>>>>>>>
>>>>>>> 2013-09-24 17:21:47,513 - [kafka-acceptor:Acceptor@153] - Error
>>>>>>> in acceptor
>>>>>>> java.io.IOException: Too many open files
>>>>>>>
>>>>>>> The obvious fix is to bump up the number of open files, but I'm
>>>>>>> wondering if there is a leak on the Kafka side and/or on our
>>>>>>> application side. We currently have the ulimit set to a generous
>>>>>>> 4096, but obviously we are hitting this ceiling. What's a
>>>>>>> recommended value?
>>>>>>>
>>>>>>> We are running Rails, and our Unicorn workers are connecting to
>>>>>>> our Kafka cluster via round-robin load balancing. We have about
>>>>>>> 1500 workers, so that would be 1500 connections right there, but
>>>>>>> they should be split across our 3 nodes.
>>>>>>> Instead, netstat shows thousands of connections that look like
>>>>>>> this:
>>>>>>>
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901 ESTABLISHED
>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349 ESTABLISHED
>>>>>>>
>>>>>>> Has anyone come across this problem before? Is this a 0.7.2 leak,
>>>>>>> an LB misconfiguration...?
>>>>>>>
>>>>>>> Thanks
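(Editor's note: a quick way to see which producer hosts account for the suspect sockets is to group the ESTABLISHED lines by remote address. A sketch, shown against a sample of the output quoted above so the pipeline is self-contained; the awk field positions assume the `netstat -tan` column layout seen in the thread.)

```shell
# Sample of the broker-side netstat output quoted in the thread:
netstat_sample='tcp 0 0 kafka1:9092 ::ffff:10.99.99.1:22503 ESTABLISHED
tcp 0 0 kafka1:9092 ::ffff:10.99.99.1:48398 ESTABLISHED
tcp 0 0 kafka1:9092 ::ffff:10.99.99.2:29617 ESTABLISHED'

# Keep ESTABLISHED rows, strip the port from the foreign address ($5),
# and count connections per remote peer, busiest first:
printf '%s\n' "$netstat_sample" \
  | awk '$6 == "ESTABLISHED" { sub(/:[0-9]+$/, "", $5); print $5 }' \
  | sort | uniq -c | sort -rn

# On a live broker, replace the printf with:  netstat -tan
```

A peer with a far higher count than its worker count suggests that host is leaking producer connections rather than reusing one.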