Re: How to recover from a disk full situation in Kafka cluster?

Neha Narkhede Thu, 17 Jul 2014 16:21:08 -0700

Connie,

> After we freed up the cluster disk space and adjusted the broker data
> retention policy, we noticed that the partitions were not balanced,
> based on the topic describe script that came with the Kafka 0.8.1.1
> distribution.

When you say the cluster was not balanced, did you mean the leaders or the
data? The describe topic tool does not give information about data sizes,
so I'm assuming you are referring to leader imbalance. If so, the right
tool to run is kafka-preferred-replica-election.sh, not partition
reassignment. In general, assuming the partitions were evenly distributed
on your cluster before you ran out of disk space, the only thing you should
need to do to recover is delete a few older segments and bounce each
broker, one at a time. It is also preferable to run preferred replica
election after a complete cluster bounce so the leaders are well
distributed.

Also, it will help if you can send around the output of the describe topic
tool. I wonder if your topics have a replication factor of 1 inadvertently?
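
In case it helps while reading that output, here is a rough sketch that
tallies how many partitions each broker leads and flags partitions with a
single replica. It assumes the per-partition lines look like
"Topic: X  Partition: N  Leader: B  Replicas: a,b  Isr: a,b", which is what
I would expect from the 0.8.1.1 tool, but check it against your actual
output. The class name DescribeSummary and the sample lines are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DescribeSummary {

    /** Count how many partitions each broker id currently leads. */
    public static Map<Integer, Integer> leadersPerBroker(List<String> lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String leader = field(line, "Leader:");
            // Skip the per-topic summary line and anything non-numeric.
            if (leader == null || !leader.matches("-?\\d+")) continue;
            int broker = Integer.parseInt(leader);
            Integer old = counts.get(broker);
            counts.put(broker, old == null ? 1 : old + 1);
        }
        return counts;
    }

    /** Return "topic-partition" names for partitions with only one replica. */
    public static List<String> singleReplicaPartitions(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String partition = field(line, "Partition:");
            String replicas = field(line, "Replicas:");
            if (partition == null || replicas == null) continue;
            if (replicas.split(",").length == 1) {
                result.add(field(line, "Topic:") + "-" + partition);
            }
        }
        return result;
    }

    /** Extract the first whitespace-delimited token after a label, or null. */
    private static String field(String line, String label) {
        int at = line.indexOf(label);
        if (at < 0) return null;
        String rest = line.substring(at + label.length()).trim();
        return rest.isEmpty() ? null : rest.split("\\s+")[0];
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not output from a real cluster.
        List<String> lines = Arrays.asList(
            "Topic:myKafkaTopic\tPartitionCount:3\tReplicationFactor:2\tConfigs:",
            "\tTopic: myKafkaTopic\tPartition: 0\tLeader: 1\tReplicas: 1\tIsr: 1",
            "\tTopic: myKafkaTopic\tPartition: 1\tLeader: 1\tReplicas: 1,2\tIsr: 1,2",
            "\tTopic: myKafkaTopic\tPartition: 2\tLeader: 2\tReplicas: 2,1\tIsr: 2,1");
        System.out.println("Leaders per broker: " + leadersPerBroker(lines));
        System.out.println("Single-replica partitions: " + singleReplicaPartitions(lines));
    }
}
```

If most leaders sit on one or two brokers, that points to leader imbalance
(fixable with preferred replica election). If partitions show a single
replica, that would suggest the replication factor question above.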

Thanks,
Neha

On Thu, Jul 17, 2014 at 11:57 AM, Connie Yang <cybercon...@gmail.com> wrote:

> Hi All,
>
> Our Kafka cluster ran out of disk space yesterday.  After we freed up the
> cluster disk space and adjusted the broker data retention policy, we
> noticed that the partitions were not balanced, based on the topic describe
> script that came with the Kafka 0.8.1.1 distribution.  So we tried to
> rebalance the partitions using kafka-reassign-partitions.sh. Some time
> later, we ran out of disk space on 2 brokers in the cluster while the rest
> had plenty of disk space left.
>
> This seems to suggest that only two brokers were receiving messages.  We
> have not changed the partitioning in our producer, which uses a random
> partition key strategy.
>
> // Random UUID as the message key, so the partition is chosen per message.
> String uuid = UUID.randomUUID().toString();
> KeyedMessage<String, String> data = new KeyedMessage<String, String>(
>     "myKafkaTopic", uuid, msgBuilder.toString());
>
>
> Questions
> 1. Is partition reassignment required after disk full or when some of the
> brokers are not healthy?
> 2. Is there a broker config that we can use to auto rebalance the broker
> partitions?  Should "auto.leader.rebalance.enable" be set to true?
> 3. How do we recover from a situation like this?
>
> We pretty much use the default configuration on the brokers.
>
> Thanks,
> Connie
>
