Our 0.7.2 Kafka cluster keeps crashing with:

2013-09-24 17:21:47,513 -  [kafka-acceptor:Acceptor@153] - Error in acceptor
        java.io.IOException: Too many open files

The obvious fix is to raise the open-file limit, but I'm wondering if there is
a leak on the Kafka side and/or our application side. We currently have the
ulimit set to a generous 4096, but we are obviously hitting that ceiling.
What's a recommended value?
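
In case it helps with comparing notes, here is a rough sketch of how the
broker's actual descriptor usage could be checked against its limit (Linux,
via /proc; KAFKA_PID below is just a placeholder for the broker's real PID):

    import os

    # Placeholder: substitute the broker's real PID (e.g. from `ps` or a pidfile).
    KAFKA_PID = 12345

    # The "Max open files" row shows the soft/hard nofile limits the broker
    # process actually runs with (which may differ from the shell's ulimit).
    with open("/proc/%d/limits" % KAFKA_PID) as f:
        for line in f:
            if line.startswith("Max open files"):
                print(line.strip())

    # Every socket, log segment and index file counts toward that limit.
    open_fds = len(os.listdir("/proc/%d/fd" % KAFKA_PID))
    print("open descriptors: %d" % open_fds)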

We are running Rails, and our Unicorn workers connect to our Kafka cluster
through round-robin load balancing. We have about 1500 workers, so that would
be roughly 1500 connections right there, but they should be split across our
3 nodes. Instead, netstat shows thousands of connections that look like this:

tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901     ESTABLISHED
tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349     ESTABLISHED

Has anyone come across this problem before? Is this a 0.7.2 leak, a
load-balancer misconfiguration, or something else?

Thanks
