Hmm, this sounds like a serious bug. I remember we had a ticket reporting
similar issues before, but I cannot find it now. Let me dig a bit deeper
later.

BTW, could you try out the 0.8.2 broker version and see if this is still
easily reproducible, i.e. by starting a bunch of producers, letting them
send data for a while, and then terminating them?

Guozhang

On Tue, Mar 10, 2015 at 1:00 PM, Allen Wang <aw...@netflix.com.invalid>
wrote:

> Hello,
>
> We are using Kafka 0.8.1.1 on the broker and the 0.8.2 producer on the
> client. After running for a few days, we found that there are far too many
> open file descriptors on the broker side. When we compared them with the
> connections on the client side, we found that some connections are already
> gone on the client but still exist on the broker. There are also
> connections on the broker whose producer instances have already terminated.
>
> We then ran netstat -o and found that the connections on the broker side
> do not have keep-alive enabled (the timer column shows "off"):
>
> tcp6       0      0 kafka-xyz:7101 ip-a-b-c-d:33471 ESTABLISHED off (0.00/0/0)
>
> We suspect that because keep-alive is not enabled on the broker, idle
> connections are never probed and therefore never cleaned up.
>
> The default 2-hour TCP keepalive time is set at the OS level on both
> sides:
>
> net.ipv4.tcp_keepalive_time = 7200
>
> On the producer side, keepalive is enabled on the connection:
>
> tcp6       0      0 ip-a-b-c-d:33471    kafka-xyz.:7101 ESTABLISHED keepalive (975.50/0/0)
>
> Is there any way to clean up the idle producer connections on the broker
> side? Does keepalive help clean up the idle connections?
>
> Thanks,
> Allen
>
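
On the keepalive question: the OS-level net.ipv4.tcp_keepalive_time only
applies to sockets that opt in with SO_KEEPALIVE, which would be consistent
with the "off" timer in the broker-side netstat output above. Below is a
minimal sketch of what opting in looks like at the socket level. It is in
Python purely for illustration and is not the broker's code; the helper
names enable_keepalive and keepalive_enabled and the timer values are
hypothetical.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Opt a TCP socket in to keepalive probing (illustrative values)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer overrides are platform-specific, so set each one
    # only if this platform's socket module exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the drop
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def keepalive_enabled(sock):
    """Return True if SO_KEEPALIVE is set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


# Loopback demo: a server-side accepted connection starts without keepalive.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

before = keepalive_enabled(conn)  # nothing has opted in yet
enable_keepalive(conn)
after = keepalive_enabled(conn)

conn.close()
client.close()
server.close()
```

With keepalive on, the kernel probes an idle peer and eventually closes the
socket if the peer is gone, so the owning process can release the file
descriptor. Without it, a connection whose peer vanished without a clean
close can stay in ESTABLISHED indefinitely.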



-- 
-- Guozhang
