[ https://issues.apache.org/jira/browse/KAFKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manikumar resolved KAFKA-4368.
------------------------------
    Resolution: Auto Closed

Closing inactive issue.

> Unclean shutdown breaks Kafka cluster
> -------------------------------------
>
>                 Key: KAFKA-4368
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4368
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 0.9.0.1, 0.10.0.0
>            Reporter: Anukool Rattana
>            Priority: Critical
>
> My team has observed that if a broker process dies uncleanly, producers are
> blocked from sending messages to the Kafka topic.
>
> Here is how to reproduce the problem:
> 1) Create a Kafka 0.10 cluster with three brokers (A, B and C).
> 2) Create a topic with replication_factor = 2.
> 3) Set the producer to send messages with "acks=all", meaning all in-sync
>    replicas must acknowledge each message before the producer can proceed
>    to the next one.
> 4) Use IEM (IBM Endpoint Manager) to push a patch to broker A and force the
>    server to reboot after the patches are installed.
> Note: min.insync.replicas = 1
>
> Result: Producers are not able to send messages to the Kafka topic after the
> broker has rebooted and rejoined the cluster. They fail with the following
> error message:
> [2016-09-28 09:32:41,823] WARN Error while fetching metadata with correlation id 0 : {logstash=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
>
> We suspected that a replication_factor of 2 is not sufficient for our Kafka
> environment, but we really need an explanation of what happens when a broker
> goes through an unclean shutdown.
> The same issue occurred when setting up a cluster with 2 brokers and
> replication_factor = 1.
>
> The workaround I used to recover service was to clean up both the Kafka topic
> log files and the ZooKeeper data (rmr /brokers/topics/XXX and
> rmr /consumers/XXX).
>
> Note:
> Topic list after broker A came back from the reboot:
> Topic:logstash   PartitionCount:3   ReplicationFactor:2   Configs:
>   Topic: logstash   Partition: 0   Leader: 1   Replicas: 1,3   Isr: 1,3
>   Topic: logstash   Partition: 1   Leader: 2   Replicas: 2,1   Isr: 2,1
>   Topic: logstash   Partition: 2   Leader: 3   Replicas: 3,2   Isr: 2,3

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
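
For anyone triaging a similar report, the topic list quoted in the issue can be checked mechanically. The following is a minimal sketch, not part of the original issue: it parses describe-style output and applies the broker-side rule for "acks=all" (a partition accepts writes only if it has a leader and its ISR holds at least min.insync.replicas members). The names parse_describe, writable_with_acks_all and LISTING are hypothetical helpers introduced for illustration, and the parser assumes each partition sits on a single line in the layout shown in the issue.

```python
import re

# Matches one partition line of describe-style output, e.g.
# "Topic: logstash Partition: 0 Leader: 1 Replicas: 1,3 Isr: 1,3".
# Assumption: one partition per line, fields in this order.
_PARTITION = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn '1,3' into [1, 3]; an empty string becomes []."""
    return [int(x) for x in csv.split(",") if x]


def parse_describe(text):
    """Return one dict per partition line; other lines are skipped."""
    partitions = []
    for line in text.splitlines():
        m = _PARTITION.search(line)
        if m is None:
            continue  # summary line, blank line, or unrelated text
        leader = m["leader"]
        partitions.append({
            "topic": m["topic"],
            "partition": int(m["partition"]),
            # A non-numeric or negative leader (e.g. "-1") means no leader.
            "leader": int(leader) if leader.isdigit() else None,
            "replicas": _ids(m["replicas"]),
            "isr": _ids(m["isr"]),
        })
    return partitions


def writable_with_acks_all(partition, min_insync_replicas=1):
    """True if the partition has a leader and a large enough ISR."""
    has_leader = partition["leader"] is not None
    return has_leader and len(partition["isr"]) >= min_insync_replicas


# The topic list from the issue, with the email quote markers removed.
LISTING = """\
Topic:logstash PartitionCount:3 ReplicationFactor:2 Configs:
Topic: logstash Partition: 0 Leader: 1 Replicas: 1,3 Isr: 1,3
Topic: logstash Partition: 1 Leader: 2 Replicas: 2,1 Isr: 2,1
Topic: logstash Partition: 2 Leader: 3 Replicas: 3,2 Isr: 2,3
"""

for p in parse_describe(LISTING):
    print(p["partition"], writable_with_acks_all(p, min_insync_replicas=1))
```

With min.insync.replicas = 1, all three partitions in the quoted listing pass this check, since each has a leader and two in-sync replicas. The listing therefore reflects the state after recovery and does not by itself explain the LEADER_NOT_AVAILABLE warning; a listing captured while broker A was down would be needed for that.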