[ 
https://issues.apache.org/jira/browse/KAFKA-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Glasser updated KAFKA-6843:
---------------------------------
    Summary: Document issue with Zookeeper DNS name resolutions changing  (was: 
Document issue with DNS TTL)

> Document issue with Zookeeper DNS name resolutions changing
> -----------------------------------------------------------
>
>                 Key: KAFKA-6843
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6843
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: David Glasser
>            Priority: Major
>
> We run Kafka and Zookeeper in Google Kubernetes Engine. We recently had an 
> incident where our brokers ran into serious trouble when GKE replaced our 
> cluster (cycling both Zookeeper and Kafka in parallel).  Kafka (1.0) brokers 
> lost the ability to talk to Zookeeper, and eventually failed their controlled 
> shutdown, leading to slow startup times for the new brokers and outages for 
> our system.
> We eventually tracked this down to the fact that (at least in our 
> environment) the default JVM DNS caching behavior is to cache results 
> forever.  We rely on DNS to connect to Zookeeper, and the DNS resolution 
> changes when the Zookeeper pods are replaced.
> The fix is straightforward: set the security property 
> networkaddress.cache.ttl or the system property sun.net.inetaddr.ttl to a 
> finite value so that cached lookups expire (the default when neither is set 
> depends on whether a "security manager" is installed). See 
> [https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html] 
> for details.
> I think this gotcha should be documented. Probably at 
> [https://kafka.apache.org/11/documentation/#java] ? I'm happy to submit a PR 
> if people agree this is the right place.  (I suppose somehow fixing this in 
> code would be nice too.)
> By the way, if you search the Apache issue tracker for 
> [networkaddress.cache.ttl|https://issues.apache.org/jira/browse/JAMES-774?jql=text%20~%20%22%5C%22networkaddress.cache.ttl%5C%22%22],
>  you'll learn that this is a common issue faced by many Apache Java projects.
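
As a rough illustration of the fix described above, the following minimal Java sketch sets a finite DNS cache TTL programmatically. The class name DnsCacheTtl, the helper apply, and the values 60 and 10 seconds are assumptions chosen for the example, not anything Kafka ships; the property names are the ones documented on the Oracle page linked above.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Hypothetical helper: sets finite TTLs (in seconds) for successful and
    // failed name lookups. The JVM generally reads these once, so this should
    // run before the first hostname resolution in the process.
    static void apply(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        // Illustrative values only; pick TTLs that suit your environment.
        apply(60, 10);
        System.out.println("networkaddress.cache.ttl="
                + Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

For a process whose startup code you do not control, such as a broker, the system property can instead be passed on the JVM command line, for example -Dsun.net.inetaddr.ttl=60 (again an illustrative value); the security property can also be set in the JRE's java.security file.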



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
