We are using a hardware load balancer with a VIP-based Ruby producer.

On Sep 26, 2013, at 7:37 AM, Jun Rao <[email protected]> wrote:
> Are you using the java or non-java producer? Are you using ZK based,
> broker-list based, or VIP based producer?
>
> Thanks,
>
> Jun
>
>
> On Wed, Sep 25, 2013 at 10:06 PM, Nicolas Berthet
> <[email protected]> wrote:
>
>> Jun,
>>
>> I observed similar kind of things recently. (didn't notice before because
>> our file limit is huge)
>>
>> I have a set of brokers in a datacenter, and producers in different data
>> centers.
>>
>> At some point I got disconnections, from the producer perspective I had
>> something like 15 connections to the broker. On the other hand on the
>> broker side, I observed hundreds of connections from the producer in an
>> ESTABLISHED state.
>>
>> We had some default settings for the socket timeout on the OS level, which
>> we reduced hoping it would prevent the issue in the future. I'm not sure if
>> the issue is from the broker or OS configuration though. I'm still keeping
>> the broker under observation for the time being.
>>
>> Note that, for clients in the same datacenter, we didn't see this issue,
>> the socket count matches on both ends.
>>
>> Nicolas Berthet
>>
>> -----Original Message-----
>> From: Jun Rao [mailto:[email protected]]
>> Sent: Thursday, September 26, 2013 12:39 PM
>> To: [email protected]
>> Subject: Re: Too many open files
>>
>> If a client is gone, the broker should automatically close those broken
>> sockets. Are you using a hardware load balancer?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Wed, Sep 25, 2013 at 4:48 PM, Mark <[email protected]> wrote:
>>
>>> FYI if I kill all producers I don't see the number of open files drop.
>>> I still see all the ESTABLISHED connections.
>>>
>>> Is there a broker setting to automatically kill any inactive TCP
>>> connections?
>>>
>>>
>>> On Sep 25, 2013, at 4:30 PM, Mark <[email protected]> wrote:
>>>
>>>> Any other ideas?
>>>>
>>>> On Sep 25, 2013, at 9:06 AM, Jun Rao <[email protected]> wrote:
>>>>
>>>>> We haven't seen any socket leaks with the java producer.
>>>>> If you have lots of unexplained socket connections in established mode,
>>>>> one possible cause is that the client created new producer instances,
>>>>> but didn't close the old ones.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jun
>>>>>
>>>>>
>>>>> On Wed, Sep 25, 2013 at 6:08 AM, Mark <[email protected]> wrote:
>>>>>
>>>>>> No. We are using the kafka-rb ruby gem producer.
>>>>>> https://github.com/acrosa/kafka-rb
>>>>>>
>>>>>> Now that you asked that question I need to ask. Is there a problem
>>>>>> with the java producer?
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>>> On Sep 24, 2013, at 9:01 PM, Jun Rao <[email protected]> wrote:
>>>>>>>
>>>>>>> Are you using the java producer client?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Jun
>>>>>>>
>>>>>>>
>>>>>>>> On Tue, Sep 24, 2013 at 5:33 PM, Mark <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Our 0.7.2 Kafka cluster keeps crashing with:
>>>>>>>>
>>>>>>>> 2013-09-24 17:21:47,513 - [kafka-acceptor:Acceptor@153] - Error in acceptor
>>>>>>>> java.io.IOException: Too many open
>>>>>>>>
>>>>>>>> The obvious fix is to bump up the number of open files but I'm
>>>>>>>> wondering if there is a leak on the Kafka side and/or our application
>>>>>>>> side. We currently have the ulimit set to a generous 4096 but
>>>>>>>> obviously we are hitting this ceiling. What's a recommended value?
>>>>>>>>
>>>>>>>> We are running rails and our Unicorn workers are connecting to our
>>>>>>>> Kafka cluster via round-robin load balancing. We have about 1500
>>>>>>>> workers so that would be 1500 connections right there but they should
>>>>>>>> be split across our 3 nodes.
>>>>>>>> Instead Netstat shows thousands of connections that look like this:
>>>>>>>>
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901 ESTABLISHED
>>>>>>>> tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349 ESTABLISHED
>>>>>>>>
>>>>>>>> Has anyone come across this problem before? Is this a 0.7.2
>>>>>>>> leak, LB misconfiguration... ?
>>>>>>>>
>>>>>>>> Thanks
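
One way to narrow down where the extra sockets come from is to tally the ESTABLISHED connections in the netstat output by remote address and compare the totals with the expected worker count (about 1500 workers across 3 nodes in the original report). The following is only a rough sketch in Python, not something posted in the thread: the function name `count_established` and the `SAMPLE` constant are made up for illustration, and it assumes each netstat line has exactly six whitespace-separated columns (protocol, two queue sizes, local address, remote address, state).

```python
from collections import Counter


def count_established(netstat_text, local_port_name):
    """Count ESTABLISHED connections per remote IP for one local port.

    Hypothetical helper: assumes six whitespace-separated columns per line
    (proto, recv-q, send-q, local address, remote address, state).
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] not in ("tcp", "tcp6"):
            continue  # skip headers, blank lines, and other protocols
        _proto, _recvq, _sendq, local, remote, state = fields
        if state != "ESTABLISHED":
            continue
        if not local.endswith(":" + local_port_name):
            continue  # connection belongs to a different local port
        # Drop the trailing ":port", then any IPv4-mapped "::ffff:" prefix.
        remote_ip = remote.rsplit(":", 1)[0]
        if remote_ip.startswith("::ffff:"):
            remote_ip = remote_ip[len("::ffff:"):]
        counts[remote_ip] += 1
    return counts


# The seven sample lines quoted in the original report.
SAMPLE = """\
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349 ESTABLISHED
"""

if __name__ == "__main__":
    for ip, n in count_established(SAMPLE, "XmlIpcRegSvc").most_common():
        print(ip, n)
```

If nearly every connection on the broker comes from one or two addresses, as in the sample, those are probably the load balancer's addresses rather than the workers' own. In that case the count seen on the broker would need to be compared with what the load balancer and the producers report on their side.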
