Re: Kafka safe Rolling restart

Robert Christ Sat, 09 Apr 2016 10:49:15 -0700

Hi Yashodhan,

I do this quite frequently and, if I understand your
question correctly, it is the default behavior.

If you issue a normal TERM signal to the Kafka process
(or call kafka-server-stop.sh), it will start a controlled
shutdown, which migrates leadership for all the partitions
it is currently leading to other brokers in the ISR.  This
cannot happen for a partition that has no other broker in
its ISR, so you probably don't want to start until you have
no under-replicated partitions.  You can check with:

kafka-topics.sh --zookeeper xxx --describe --under-replicated-partitions
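
In a script, that check could be turned into a wait loop. The following is only a sketch: it assumes the command prints one line per under-replicated partition and nothing when there are none, and the names KAFKA_TOPICS, ZK, under_replicated_count and wait_for_full_isr are illustrative placeholders, not part of Kafka.

```shell
#!/bin/sh
# Assumed placeholders: adjust for your installation.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"
ZK="${ZK:-xxx}"   # ZooKeeper connect string, left elided as in the command above

# Print the number of under-replicated partitions.
# Assumption: one non-empty output line per under-replicated partition.
# Returns non-zero if the command itself fails, so that a broken check
# is never mistaken for a healthy cluster.
under_replicated_count() {
  out=$("$KAFKA_TOPICS" --zookeeper "$ZK" --describe --under-replicated-partitions) || return 1
  printf '%s\n' "$out" | grep -c . || true
}

# Block until the count reaches zero, polling every $1 seconds (default 10).
wait_for_full_isr() {
  while :; do
    n=$(under_replicated_count) || return 1
    if [ "$n" -eq 0 ]; then
      return 0
    fi
    sleep "${1:-10}"
  done
}
```

Calling wait_for_full_isr before stopping each broker, and again after it comes back, keeps the restart from proceeding while any partition is short of replicas.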

After controlled shutdown completes, the process should exit.
I have seen some cases where the broker appears to complete
shutdown and has moved leadership for all of its partitions
but it does not exit the process.  If this happens I have
just issued a hard KILL to the process.  It is possible that
I am just impatient and the process will eventually exit.
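
For a scripted restart, the wait-then-KILL fallback can be made explicit. A rough sketch, assuming the broker's PID is already known; the function name stop_broker and the 300-second default timeout are arbitrary choices for illustration, not anything Kafka provides.

```shell
#!/bin/sh
# Send TERM to start controlled shutdown, wait up to $2 seconds
# (default 300) for the process to exit, then fall back to KILL.
stop_broker() {
  pid=$1
  timeout=${2:-300}
  kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "pid $pid still running after ${timeout}s, sending KILL" >&2
      kill -KILL "$pid"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

A generous timeout is the safer choice here, since the broker may simply need more time to finish shutting down.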

When you restart the broker, it will catch up on all
the partitions for which it is in the replica list.
Hopefully it will quickly enter the ISR as well.

Next, by default Kafka has auto leader rebalancing enabled.
It is controlled by this parameter:

auto.leader.rebalance.enable  (default: true)

These two parameters also control the rebalancing:

leader.imbalance.check.interval.seconds   (default: 300)
leader.imbalance.per.broker.percentage     (default: 10)

So, on average, a rebalancing check should occur about 150
seconds after your broker has returned (half of the 300-second
check interval).  If the imbalance exceeds the configured
percentage, this will move the leadership back to your broker
for partitions where it is the preferred leader, which just
means it shows up first in the replica list, provided it is
also in the ISR.

There is also the script:

kafka-preferred-replica-election.sh

which will trigger the election manually.  I have only tried
it once, but it appeared to do the job.
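
Putting the pieces together for the three-step workflow in the quoted message, the per-broker loop might look like the sketch below. The hooks wait_healthy and restart_broker are hypothetical names that would have to be defined for your environment, and ZK is a placeholder; only the election script and its --zookeeper option come from Kafka itself.

```shell
#!/bin/sh
ZK="${ZK:-xxx}"   # ZooKeeper connect string, placeholder

# Trigger a preferred replica election for all partitions.
elect_preferred() {
  kafka-preferred-replica-election.sh --zookeeper "$ZK"
}

# Hypothetical hooks, to be defined before calling rolling_restart:
#   wait_healthy          - return 0 once no partitions are under-replicated
#   restart_broker <host> - stop and start the broker on <host>
#
# Restart the given brokers one at a time, stopping at the first failure.
rolling_restart() {
  for broker in "$@"; do
    wait_healthy             || return 1   # safe to take a broker down
    restart_broker "$broker" || return 1   # TERM moves leadership away
    wait_healthy             || return 1   # broker is back in the ISR
    elect_preferred          || return 1   # move leadership back
  done
}
```

No separate step is needed to move leadership away, because controlled shutdown already does that when the broker receives TERM.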

Good luck,
  rob

> On Apr 8, 2016, at 9:03 PM, Yashodhan Kocharekar <ykoch...@tibco.com> wrote:
>
> Hi, I am trying to write a script for a safe rolling restart of a
> kafka_2.9.2-0.8.1.1 cluster. The high-level workflow is:
>
> for each broker do
>   1. move partition replica leadership from current_broker to others
>   2. restart the broker
>   3. restore leadership to the broker
>
> I have found a script to do 1:
> https://gist.github.com/miguno/87d5b2411e3f93e80866
> I am not sure how to do 3.

