Stephane Maarek created KAFKA-4871:
--------------------------------------

             Summary: Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes
                 Key: KAFKA-4871
                 URL: https://issues.apache.org/jira/browse/KAFKA-4871
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.10.2.0
            Reporter: Stephane Maarek


I had a Zookeeper cluster whose machines automatically register their 
hostnames, so that the hostnames remain constant over time. I deleted my 3 
Zookeeper machines, and new machines came back online with the same hostnames 
and updated their CNAME records.

Kafka then failed and couldn't reconnect to Zookeeper as it didn't try to 
resolve the IP of Zookeeper again. See log below:

[2017-03-09 05:49:57,302] INFO Client will use GSSAPI as SASL mechanism. 
(org.apache.zookeeper.client.ZooKeeperSaslClient)
[2017-03-09 05:49:57,302] INFO Opening socket connection to server 
zookeeper-3.example.com/10.12.79.43:2181. Will attempt to SASL-authenticate 
using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)

[ec2-user]$ dig +short zookeeper-3.example.com
10.12.79.36

As you can see, even though the machine itself resolves the hostname to the 
new IP (10.12.79.36), Kafka kept trying the old one (10.12.79.43). It somehow 
didn't respect the DNS TTL (set to 60 seconds) and never picked up the new IP. 
I feel that on a failed Zookeeper connection, Kafka should at least try to 
resolve the Zookeeper hostname again. That would allow Kafka to keep up with 
Zookeeper changes over time.
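
To make the expected behaviour concrete, here is a minimal Java sketch. It is 
not Kafka or ZooKeeper client code: the class name ReResolvingEndpoint and its 
resolve() method are hypothetical, and I am only assuming that the stale 
address comes from a lookup done once at startup. The idea is to keep the 
hostname and look it up again before every connection attempt. The JVM keeps 
its own cache of successful lookups, controlled by the 
networkaddress.cache.ttl security property, so the sketch also caps that at 
60 seconds to match the record's TTL.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.security.Security;

// Hypothetical helper, for illustration only: it stores the hostname rather
// than a resolved address, so each reconnect attempt can do a fresh lookup.
final class ReResolvingEndpoint {
    private final String host;
    private final int port;

    ReResolvingEndpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Looks the hostname up again on every call. The result can still come
    // from the JVM's own lookup cache until that cache entry expires.
    InetSocketAddress resolve() throws UnknownHostException {
        InetAddress address = InetAddress.getByName(host);
        return new InetSocketAddress(address, port);
    }

    public static void main(String[] args) throws Exception {
        // Cap the JVM cache for successful lookups at 60 seconds. This should
        // be set before the first lookup is made.
        Security.setProperty("networkaddress.cache.ttl", "60");

        String host = args.length > 0 ? args[0] : "localhost";
        ReResolvingEndpoint zookeeper = new ReResolvingEndpoint(host, 2181);

        // A reconnect loop would call resolve() before each attempt instead
        // of reusing the address obtained at startup.
        System.out.println("Would connect to " + zookeeper.resolve());
    }
}
```

With something like this, a client that fails to connect to 10.12.79.43 would 
ask for the address of zookeeper-3.example.com again on the next attempt and, 
once the cached entry has expired, get 10.12.79.36.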

What do you think? Is that expected behaviour or a bug?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
