Hi Rob,

Thanks for the detailed reply. Let me share more info on this. Unfortunately, we are in production, running traffic of ~25,000 messages/sec on a Kafka *0.8.1.1* cluster. In this version:

*controlled.shutdown.enable = false*
This property enables controlled shutdown behaviour, but it is set to false by default.

*auto.leader.rebalance.enable = false*
This is also set to false by default.

Setting either property to true requires a broker restart. See
https://kafka.apache.org/081/documentation.html#brokerconfigs

We need a broker restart across the cluster to increase *log.retention.hours*, so we are planning to piggyback on this and set *controlled.shutdown.enable* and *auto.leader.rebalance.enable* to true, so that from the next restart onwards we can expect the behaviour you describe as the default. But this does not help us with the first restart.

Given the above settings, I used the script
https://gist.github.com/miguno/87d5b2411e3f93e80866
to move leadership away from a broker and then restarted it. To restore leadership I tried running kafka-preferred-replica-election.sh, but it does not seem to work: the broker that was just restarted does come back into the ISR list, but not in the preferred position (first in the list), and running kafka-preferred-replica-election.sh does not change the leadership balance.

We are taking all these pains to minimize data loss and prevent duplicates while we do the rolling restart. We are using Kafka producers with the setting *request.required.acks = 1*
https://kafka.apache.org/081/documentation.html#producerconfigs

Thanks,
Yash

On Sat, Apr 9, 2016 at 11:18 PM, Robert Christ <rchr...@tivo.com> wrote:

> Hi Yashodhan,
>
> I do this quite frequently and if I understand your
> question correctly, it is the default behavior.
>
> If you issue a normal TERM signal to the kafka process
> (or call kafka-server-stop.sh) it will start controlled
> shutdown which will migrate leadership for all the partitions
> it is currently leading to other brokers in the ISR. This
> will not happen if there are no other brokers in the ISR so
> you probably don't want to start this until you have no
> under replicated partitions.
> You can check with:
>
> kafka-topics.sh --zookeeper xxx --describe --under-replicated-partitions
>
> After controlled shutdown completes the process should exit.
> I have seen some cases where the broker appears to complete
> shutdown and has moved leadership for all of its partitions
> but it does not exit the process. If this happens I have
> just issued a hard KILL to the process. It is possible that
> I am just impatient and the process will eventually exit.
>
> When you restart the broker, it will catch up on all
> the partitions for which it is in the replica list.
> Hopefully it will quickly enter the ISR as well.
>
> Next, by default kafka has auto leader rebalancing enabled.
> It is controlled by this parameter:
>
> auto.leader.rebalance.enable (default: true)
>
> These two parameters also control the rebalancing:
>
> leader.imbalance.check.interval.seconds (default: 300)
> leader.imbalance.per.broker.percentage (default: 10)
>
> So, on average about 150 seconds after your broker has
> returned a rebalancing event should occur. This will
> move the leadership back to your broker for partitions
> where it is the preferred leader which just means it
> shows up first in the replica list and it is in the ISR.
>
> There is also the script:
>
> kafka-preferred-replica-election.sh
>
> which will trigger the election. I have only tried it once
> but it appeared to do the job.
>
> Good luck,
> rob
>
>
> > On Apr 8, 2016, at 9:03 PM, Yashodhan Kocharekar <ykoch...@tibco.com> wrote:
> >
> > Hi, I am trying to write a script for a safe rolling restart of a
> > kafka_2.9.2-0.8.1.1 cluster. The high-level workflow is:
> >
> > for each broker do
> > 1. move partition replica leadership from current_broker to others
> > 2. restart the broker
> > 3. restore leadership to the broker
> >
> > I have found a script to do 1:
> > https://gist.github.com/miguno/87d5b2411e3f93e80866
> > I am not sure how to do 3.
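For step 3 of the workflow in this thread, one way to see whether a preferred replica election can move leadership back is to compare, per partition, the Leader field against the first entry of the Replicas list (not the ISR list), and to check that this broker is present in the ISR, as described in the quoted reply. The following is only a rough sketch under assumptions: it assumes the `kafka-topics.sh --describe` output has per-partition lines of the form `Topic: t  Partition: 0  Leader: 2  Replicas: 1,2  Isr: 2,1`, and the names `parse_describe` and `leadership_report` are made up for illustration, they are not part of Kafka.

```python
import re
from typing import Dict, List, Tuple

# Assumed shape of a per-partition line in `kafka-topics.sh --describe` output.
# The per-topic summary line (PartitionCount, ReplicationFactor) does not match.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_describe(output: str) -> List[Dict]:
    """Turn describe output into one dict per partition line."""
    rows = []
    for line in output.splitlines():
        m = PARTITION_LINE.search(line)
        if m is None:
            continue  # summary lines, blank lines, anything unexpected
        rows.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "leader": int(m.group("leader")),
            "replicas": [int(b) for b in m.group("replicas").split(",") if b],
            "isr": [int(b) for b in m.group("isr").split(",") if b],
        })
    return rows


def leadership_report(rows: List[Dict], broker_id: int) -> Dict[str, List[Tuple[str, int]]]:
    """Classify the partitions whose preferred replica is broker_id.

    ok         -> broker_id already leads the partition
    electable  -> broker_id is in the ISR but is not the leader
    not_in_isr -> broker_id has not caught up yet
    """
    report: Dict[str, List[Tuple[str, int]]] = {"ok": [], "electable": [], "not_in_isr": []}
    for row in rows:
        # The preferred replica is the first entry of the Replicas list.
        if row["replicas"][0] != broker_id:
            continue
        key = (row["topic"], row["partition"])
        if row["leader"] == broker_id:
            report["ok"].append(key)
        elif broker_id in row["isr"]:
            report["electable"].append(key)
        else:
            report["not_in_isr"].append(key)
    return report
```

The idea would be to feed it the text printed by `kafka-topics.sh --zookeeper xxx --describe` after the broker has been restarted. Partitions listed under `not_in_isr` are ones where the broker has not rejoined the ISR yet, so running the election for them at that moment would not be expected to change the leader.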