Hi Eno,

Thanks too, this is indeed helpful.

Best regards
Patrik

> On 18.03.2019 at 18:16, Eno Thereska <eno.there...@gmail.com> wrote:
>
> Hi folks,
>
> The team here has come up with a couple of clarifying tips for
> operationalizing ZooKeeper for Kafka that we found missing from the
> official documentation, and passed them along to share. If you find them
> useful, I'm thinking of putting them on
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ. Meanwhile any
> feedback is appreciated.
>
> -------
> Operationalizing ZooKeeper FAQ
>
> The discussion below uses a 3-instance ZooKeeper cluster as an example. The
> findings apply to a larger cluster as well, but you’ll need to adjust the
> numbers.
>
> - Does it make sense to have a config with only 2 ZooKeeper instances?
> I.e., the zookeeper.properties file has two entries, for server 1 and
> server 2 only.
> A: No. A setup with 2 ZooKeeper instances cannot tolerate even 1 failure.
> If one of the instances fails, the remaining one will not be functional
> since there is no quorum majority (1 out of 2 is not a majority). If you
> run the “stat” command against that remaining instance, you’ll see the
> output “This ZooKeeper instance is not currently serving requests”.
>
> - What if you end up with only 2 running ZooKeeper instances, e.g., you
> started with 3 but one failed? Isn’t that the same as the case above?
> A: No, it’s not the same scenario. The 3-instance setup tolerates 1
> instance being down. The 2 remaining ZooKeeper instances will continue
> to function because the quorum majority (2 out of 3) is still there.
>
> - I had a 3-instance ZooKeeper setup and one instance just failed. How
> should I recover?
> A: Restart the failed instance with the same configuration it had before
> (i.e., same “myid” ID file, and same IP address). It is not important to
> recover the data volume of the failed instance, but it is a bonus if you
> do so. Once the instance comes up, it will sync with the other 2
> ZooKeeper instances and get all the data.
>
> - I had a 3-instance ZooKeeper setup and two instances failed. How should
> I recover? Is my ZooKeeper cluster even running at that point?
> A: First of all, ZooKeeper is now unavailable and the remaining instance
> will show “This ZooKeeper instance is not currently serving requests” if
> probed. Second, you should make sure this situation is extremely rare. It
> should be possible to recover the first failed instance quickly before
> the second instance fails. Third, bring up the two failed instances one
> by one without changing anything in their config. As in the case above,
> it is not important to recover the data volume of a failed instance, but
> it is a bonus if you do so. Once an instance comes up, it will sync with
> the remaining ZooKeeper instance and get all the data.
>
> - I had a 3-instance ZooKeeper setup and two instances failed. I can’t
> recover the failed instances for whatever reason. What should I do?
> A: You will have to restart the remaining healthy ZooKeeper instance in
> “standalone” mode, then restart all the brokers and point them to this
> standalone ZooKeeper (instead of all 3 ZooKeepers).
>
> - The ZooKeeper cluster is unavailable (for any of the reasons mentioned
> above, e.g., no quorum, all instances have failed). What is the impact on
> Kafka applications producing/consuming? What is the impact on admin tools
> to manage topics and cluster? What is the impact on brokers?
> A: Applications will be able to continue producing and consuming, at
> least for a while. This is true if the ZooKeeper cluster is temporarily
> unavailable but eventually becomes available (after a few mins).
> On the other hand, if the ZooKeeper cluster is permanently unavailable,
> then applications will slowly start to see problems with
> producing/consuming, especially if some brokers fail, because partition
> leadership will not be moved to other brokers. So taking one extreme, if
> the ZooKeeper cluster is down for a month, it is very likely that
> applications will get produce/consume errors. Admin tools (e.g., those
> that create topics, set ACLs or change configs) will not work. Brokers
> will not be impacted by ZooKeeper being unavailable. They will
> periodically try to reconnect to the ZooKeeper cluster. If you take care
> to use the same IP address for a recovered ZooKeeper instance as it had
> before it failed, brokers will not need to be restarted.
> ------
>
> Cheers,
> Eno
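
The quorum arithmetic behind the first few answers in the FAQ can be sketched in a few lines of Python. This is only an illustration, assuming the plain strict-majority rule the FAQ describes (more than half of the configured instances must be running). The function names are hypothetical and are not part of ZooKeeper or Kafka.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of running instances that is a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, running: int) -> bool:
    """True if `running` instances out of `ensemble_size` form a majority."""
    if not 0 <= running <= ensemble_size:
        raise ValueError("running must be between 0 and ensemble_size")
    return running >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """How many instances can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare the 2-instance and 3-instance setups discussed in the FAQ,
    # plus a 5-instance setup for the "adjust the numbers" case.
    for size in (2, 3, 5):
        print(size, "instances: quorum =", quorum_size(size),
              "tolerated failures =", tolerated_failures(size))
```

With these definitions, has_quorum(2, 1) is False (the 2-instance setup with one failure) while has_quorum(3, 2) is True (the 3-instance setup with one failure), which matches the first two answers.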