Hi, Stephane, Thanks for the reply.

1) Fixing the issue in ZK will be ideal. Not sure when it will happen though. Once it's fixed, we can probably deprecate this config.

2) That could be useful. Is there a java api to do that at runtime? Also, invalidating DNS cache doesn't always fix the issue of unresolved host. In some of the cases, human intervention is needed.

3) The default timeout is infinite though.

Jun

On Sat, Oct 28, 2017 at 11:48 PM, Stephane Maarek <steph...@simplemachines.com.au> wrote:

> Hi Jun,
>
> I think this is very helpful. Restarting Kafka brokers in case of zookeeper
> host change is not a well known operation.
>
> Few questions:
> 1) would it not be worth fixing the problem at the source ? This has been
> stuck for a while though, maybe a little push would help :
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/ZOOKEEPER-2184
>
> 2) upon recreating the zookeeper object , is it not possible to invalidate
> the DNS cache so that it resolves the new hostname?
>
> 3) could the cluster be down in this situation: one migrates an entire
> zookeeper cluster to new machines (one by one). The quorum is still alive
> without downtime, but now every broker in a cluster can't resolve zookeeper
> at the same time. They all shut down at the same time after the new
> time-out setting.
>
> Thanks !
> Stéphane
>
> On 28 Oct. 2017 9:42 am, "Jun Rao" <j...@confluent.io> wrote:
>
> > Hi, Everyone,
> >
> > We created "KIP-217: Expose a timeout to allow an expired ZK session to be
> > re-created".
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-217%3A+Expose+a+timeout+to+allow+an+expired+ZK+session+to+be+re-created
> >
> > Please take a look and provide your feedback.
> >
> > Thanks,
> >
> > Jun