Pretty much, although it's not actually related to ZooKeeper. Generalising a bit, a replication factor of 2 means Kafka can lose one machine and still be OK, provided the surviving replica was in sync.
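To make the arithmetic concrete, here is a toy Python sketch. It is not Kafka code: the function names and broker ids are made up for illustration, and it only models the rule of thumb that a partition's data survives as long as one in-sync replica sits on a broker that is still up.

```python
def tolerated_failures(replication_factor):
    """Number of brokers a partition can lose while one replica still remains."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def survives_without_data_loss(in_sync_replicas, failed_brokers):
    """True if at least one in-sync replica lives on a broker that has not failed."""
    return bool(set(in_sync_replicas) - set(failed_brokers))


# Replication factor 2, replicas on (hypothetical) brokers 1 and 2, both in sync:
print(tolerated_failures(2))                      # 1
print(survives_without_data_loss({1, 2}, {1}))    # True

# Same topic, but broker 2 ran slow and fell out of the in-sync set:
print(survives_without_data_loss({1}, {1}))       # False

# Replication factor 1: no machine can be lost.
print(tolerated_failures(1))                      # 0
```

The second case is the reason for the "so long as they were in sync" caveat: the replica count alone does not tell you whether a failure is safe.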
B

> On 1 Jun 2016, at 12:46, Hafsa Asif <hafsa.a...@matchinguu.com> wrote:
>
> So, it means that I should create topics with at least replication-factor=2
> inspite of how many servers in a kafka cluster. If any server goes down or
> slows down then zookeeper will not go out-of-sync.
> Currently, my all topics are with replication-factor=1 and I got an issue
> that Zookeeper goes out of sync. So, increasing replication-factor will
> solve the issue?
>
> Hafsa
>
> 2016-06-01 12:57 GMT+02:00 Ben Stopford <b...@confluent.io>:
>
>> Hi Hafa
>>
>> If you create a topic with replication-factor = 2, you can lose one of
>> them without losing data, so long as they were "in sync". Replicas can fall
>> out of sync if one of the machines runs slow. The system tracks in sync
>> replicas. These are exposed by JMX too. Check out the docs on replication
>> for more details:
>>
>> http://kafka.apache.org/090/documentation.html#replication
>>
>> B
>>
>>> On 1 Jun 2016, at 10:45, Hafsa Asif <hafsa.a...@matchinguu.com> wrote:
>>>
>>> Hello Jayesh,
>>>
>>> Thank you very much for such a good description. My further questions are
>>> (just to be my self clear about the concept).
>>>
>>> 1. If I have only one partition in a 'Topic' in a Kafka with following
>>> configuration,
>>> bin/kafka-topics.sh --create --zookeeper localhost:2181
>>> --replication-factor 1 --partitions 1 --topic mytopic1
>>> Then still I need to rebalance topic partitions while node adding/removing
>>> in Kafka cluster?
>>>
>>> 2. What is the actual meaning of this line 'if all your topics have atleast
>>> 2 insync replicas'. My understanding is that, I need to create replica of
>>> each topic in each server. e.g: I have two servers in a Kafka cluster then
>>> I need to create topic 'mytopic1' in both servers. It helps to get rid of
>>> any problem while removing any of the server.
>>>
>>> I will look in detail into your provided link. Many thanks for this.
>>>
>>> Looking forward for the answers from also other Kafka ninjas as well :)
>>>
>>> Best,
>>> Hafsa
>>>
>>> 2016-05-31 18:50 GMT+02:00 Thakrar, Jayesh <jthak...@conversantmedia.com>:
>>>
>>>> Hafsa, Florin
>>>>
>>>> First thing first, it is possible to scale a Kafka cluster up or down
>>>> (i.e. add/remove servers).
>>>> And as has been noted in this thread, after you add a server to a cluster,
>>>> you need to rebalance the topic partitions in order to put the newly added
>>>> server into use.
>>>> And similarly, before you remove a server, it is advised that you drain
>>>> off the data from the server to be removed (its not a hard requirement, if
>>>> all your topics have atleast 2 insync replicas, including the server being
>>>> removed and you intend to rebalance after server removal).
>>>>
>>>> However, "automating" the rebalancing of topic partitions is not trivial.
>>>>
>>>> There is a KIP out there to help with the rebalancing, but lacks details -
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+rebalancing
>>>> My guess is due to its non-trivial nature AND the number of cases one
>>>> needs to take care of - e.g. scaling up by 5% v/s scaling up by 50% in say,
>>>> a 20 node cluster.
>>>> Furthermore, to be really effective, one needs to be cognizant of the
>>>> partition sizes, and with rack-awareness, the task becomes even more
>>>> involved.
>>>>
>>>> Regards,
>>>> Jayesh
>>>>
>>>> -----Original Message-----
>>>> From: Spico Florin [mailto:spicoflo...@gmail.com]
>>>> Sent: Tuesday, May 31, 2016 9:44 AM
>>>> To: users@kafka.apache.org
>>>> Subject: Re: Rebalancing issue while Kafka scaling
>>>>
>>>> Hi!
>>>> What version of Kafka you are using? What do you mean by "Kafka needs
>>>> rebalacing?" Rebalancing of what? Can you please be more specific.
>>>>
>>>> Regards,
>>>> Florin
>>>>
>>>> On Tue, May 31, 2016 at 4:58 PM, Hafsa Asif <hafsa.a...@matchinguu.com>
>>>> wrote:
>>>>
>>>>> Hello Folks,
>>>>>
>>>>> Today, my team members shows concern that whenever we increase node
>>>>> in Kafka cluster, Kafka needs rebalancing. The rebalancing is sort of
>>>>> manual and not-good step whenever scaling happens. Second, if Kafka
>>>>> scales up then it cannot be scale down. Please provide us proper
>>>>> guidance over this issue, may be we have not enough configuration
>>>>> properties.
>>>>>
>>>>> Hafsa