Arpan created KAFKA-5153:
----------------------------

             Summary: KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
                 Key: KAFKA-5153
                 URL: https://issues.apache.org/jira/browse/KAFKA-5153
             Project: Kafka
          Issue Type: Bug
         Environment: RHEL 6
Java Version  1.8.0_91-b14
            Reporter: Arpan
            Priority: Critical
         Attachments: server_1_72server.log, server_2_73_server.log, 
server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
ThreadDump_1493564177.dump, ThreadDump_1493564249.dump

Hi Team, 

I was earlier referring to issue KAFKA-4477 because the problem I am facing is 
similar. I searched the release docs for a reference to it as well, but did 
not find anything for 0.10.1.1 or 0.10.2.0. I am currently using 2.11_0.10.2.0.

I have a 3-node Kafka cluster, with a ZooKeeper cluster running in cluster 
mode on the same set of servers. Around 240GB of data is transferred through 
Kafka every day. What we are observing is that a server disconnects from the 
cluster, the ISR shrinks, and service starts being impacted.

I have also observed the file descriptor count increasing a bit. Under normal 
circumstances we have not seen the FD count exceed 500, but when the issue 
started it was in the range of 650-700 on all 3 servers. I am attaching 
thread dumps from all 3 servers, taken when we recently started facing the 
issue.
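
For anyone trying to track the same symptom, here is a minimal sketch of how 
the FD count of a process could be sampled on Linux by listing /proc/<pid>/fd. 
The function names are hypothetical, and the default baseline of 500 is simply 
the normal-case figure mentioned above, not a Kafka setting.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count open file descriptors of a process by listing <proc_root>/<pid>/fd.

    Linux only. Reading another user's process typically requires running as
    that user or as root.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    return len(os.listdir(fd_dir))


def exceeds_baseline(fd_count, baseline=500):
    """Return True when the FD count is above the assumed normal baseline."""
    return fd_count > baseline
```

Calling count_open_fds with the broker's PID at regular intervals and logging 
the result would show whether the count climbs steadily before a disconnect 
or jumps only once the problem has started.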

The issue goes away once the nodes are bounced, but the setup does not run 
for more than 5 days without the issue recurring. I am attaching the server 
logs as well.

Kindly let me know if you need any additional information. I am attaching 
server.properties as well for one of the servers (it is similar on all 3 
servers).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)