Hi Jun, I think this is very helpful. Restarting Kafka brokers when a ZooKeeper host changes is not a well-known operation.
A few questions:

1) Would it not be worth fixing the problem at the source? This has been stuck for a while, though; maybe a little push would help: https://issues.apache.org/jira/plugins/servlet/mobile#issue/ZOOKEEPER-2184

2) Upon recreating the ZooKeeper object, is it not possible to invalidate the DNS cache so that it resolves the new hostname?

3) Could the cluster go down in this situation: one migrates an entire ZooKeeper cluster to new machines (one by one). The quorum stays alive without downtime, but now every broker in the cluster can't resolve ZooKeeper at the same time, so they all shut down together once the new timeout expires.

Thanks!
Stéphane

On 28 Oct. 2017 9:42 am, "Jun Rao" <j...@confluent.io> wrote:

> Hi, Everyone,
>
> We created "KIP-217: Expose a timeout to allow an expired ZK session to be
> re-created".
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-217%3A+Expose+a+timeout+to+allow+an+expired+ZK+session+to+be+re-created
>
> Please take a look and provide your feedback.
>
> Thanks,
>
> Jun