Hanish Bansal created KAFKA-1193:
------------------------------------

             Summary: Data loss if broker is killed using kill -9
                 Key: KAFKA-1193
                 URL: https://issues.apache.org/jira/browse/KAFKA-1193
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8.0, 0.8.1
         Environment: Centos 6.3
            Reporter: Hanish Bansal
            Assignee: Neha Narkhede


We have a Kafka cluster of 2 nodes (running Kafka 0.8.0).
Replication Factor: 2
Number of partitions: 2

Actual Behaviour:
-------------------------
If the leader node of the two goes down, data is lost.

Steps to Reproduce:
------------------------------
1. Create a 2-node Kafka cluster with replication factor 2.
2. Start the Kafka cluster.
3. Create a topic, say "test-trunk111".
4. Restart any one node.
5. Check the topic status using the kafka-list-topic tool.
The topic ISR status is:

topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0,1
topic: test-trunk111    partition: 1    leader: 0    replicas: 0,1    isr: 0,1

If there is only one broker in the ISR list, wait for some time and check the 
ISR status of the topic again. There should be 2 brokers in the ISR list.
6. Start producing data.
7. Kill the leader node (broker-0 in our case) while data is being produced.
8. After all data has been produced, start a consumer.
9. Observe the behaviour. There is data loss.
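
One way to make the loss in step 9 measurable is to tag every produced message 
with a sequence number and compare that against what the consumer receives. 
The following is a minimal sketch, assuming such numbering is added to the test 
payloads. The function names and the numbering scheme are hypothetical and are 
not part of the Kafka tooling or of the original test.

```python
def find_missing(sent_ids, received_ids):
    """Return the sorted sequence numbers that were sent but never consumed."""
    return sorted(set(sent_ids) - set(received_ids))


def summarize(sent_ids, received_ids):
    """Summarize loss and duplication for one test run."""
    unique_received = set(received_ids)
    return {
        "sent": len(sent_ids),
        "received_unique": len(unique_received),
        "missing": find_missing(sent_ids, received_ids),
        # Retries can deliver the same message more than once.
        "duplicates": len(received_ids) - len(unique_received),
    }


if __name__ == "__main__":
    # Made-up sample data for illustration only, not results from the report.
    sent = list(range(10))
    received = [0, 1, 2, 3, 7, 8, 9, 9]
    print(summarize(sent, received))
```

Counting duplicates separately matters here because the producer retries 
mentioned in this report can redeliver messages, so a raw message count alone 
could hide a loss.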

After the leader goes down, the topic ISR status is:

topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1
topic: test-trunk111    partition: 1    leader: 1    replicas: 0,1    isr: 1
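
The ISR check can be automated by parsing the status lines shown in this 
report. The sketch below assumes the tool prints one line per partition in 
exactly the format quoted above. The regular expression and helper names are 
illustrative assumptions, not part of Kafka.

```python
import re

# Matches lines such as:
# topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Convert a comma-separated broker list such as '1,0' to [1, 0]."""
    return [int(x) for x in csv.split(",") if x]


def parse_status(text):
    """Parse status output into one dict per partition line; skip other lines."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        rows.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "leader": int(m.group("leader")),
            "replicas": _ids(m.group("replicas")),
            "isr": _ids(m.group("isr")),
        })
    return rows


def under_replicated(rows):
    """Return (topic, partition) pairs whose ISR differs from the replica set."""
    return [
        (r["topic"], r["partition"])
        for r in rows
        if set(r["isr"]) != set(r["replicas"])
    ]
```

Applied to the two status listings in this report, the check returns no 
partitions for the listing in step 5 and both partitions for the listing taken 
after the leader was killed.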

We have tried the following to avoid data loss:
----------------------------------------------------------------

1. Configured "request.required.acks=-1" in the producer configuration because, 
according to the documentation at 
http://kafka.apache.org/documentation.html#producerconfigs, setting this value 
to -1 guarantees that no messages will be lost.
2. Increased "message.send.max.retries" from 3 to 10 in the producer 
configuration.

3. Set "controlled.shutdown.enable" to true in the broker configuration.

4. Tested with Kafka 0.8.1 after applying the patch KAFKA-1188.patch, available 
at https://issues.apache.org/jira/browse/KAFKA-1188 

None of the above prevents data loss when the leader node is killed using "kill 
-9 <pid>".
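
For reference, the settings listed above can be collected in one place and 
rendered as properties-file lines. This is only a sketch: the setting names and 
values are the ones given in this report, while the dictionary and function 
names are hypothetical. Note that "kill -9" sends SIGKILL, which a process 
cannot catch, so a controlled shutdown has no chance to run in that case.

```python
# Producer-side settings tried in this report.
PRODUCER_SETTINGS = {
    "request.required.acks": "-1",
    "message.send.max.retries": "10",
}

# Broker-side setting tried in this report.
BROKER_SETTINGS = {
    "controlled.shutdown.enable": "true",
}


def to_properties(settings):
    """Render a dict as key=value lines in insertion order."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    print(to_properties(PRODUCER_SETTINGS), end="")
    print(to_properties(BROKER_SETTINGS), end="")
```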

Expected Behaviour:
----------------------------
No data should be lost.




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
