[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148520#comment-14148520 ]
Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------

2. I agree. I think what min_isr gives us is a way to specify "I don't want to lose my data as long as at most min_isr - 1 nodes are down". For example, if no_of_replicas=3, min_isr=2 and ack=-1, we should not lose data as long as only one node is down, even when there is an unclean leader election. In this particular case, when the leader fails, all other replica nodes are expected to be up, although they could be out of the isr. Under such constraints it is definitely possible to prevent data loss (ignoring data loss due to system failures and data not flushed to disk) by making the node with the longest log the leader (assuming we ensure the logs don't diverge).

3. I prefer b or c. d is attractive since you could use just one variable to define your required guarantees, but it is hard to understand at the API level.

4. I totally agree. The issue is that the ISR takes a while to reflect reality. Assume we failed early, before writing to the local log, and did not have any checks after writing. Replicas go down. It would take a while for the isr to reflect that the replicas are no longer in it. During this time, we would simply write the messages to the log and lose them later.

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch,
> KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch
>
>
> In a mission critical application, we expect that a kafka cluster with 3
> brokers can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be
> blocked, but no message loss happens.
> We found that the current kafka version (0.8.1.1) cannot achieve the
> requirements due to three of its behaviors:
> 1. when choosing a new leader from 2 followers in ISR, the one with fewer
> messages may be chosen as the leader.
> 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it
> has fewer messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be
> stored in only 1 broker.
> The following is an analytical proof.
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume
> that at the beginning, all 3 replicas, leader A, followers B and C, are in
> sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are
> the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this
> time, although C hasn't received m, C is still in ISR. If A is killed, C can
> be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a
> message m to A, and receives an acknowledgement. Disk failure happens in A
> before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
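
For anyone following the thread, here is a rough toy model of the acks=-1 case from the description (B and C out of the ISR, only A acknowledges) and of how a min_isr check would change it. This is only a sketch, not Kafka code: `produce`, `NotEnoughReplicas`, and the `logs` dict are hypothetical names made up for illustration, and it assumes the check is done once, before the write.

```python
# Toy model only, not Kafka code. All names here are hypothetical.

class NotEnoughReplicas(Exception):
    """Raised when the ISR is smaller than min_isr for an acks=-1 write."""


def produce(isr, min_isr, acks, message, logs):
    """Append `message` to the log of every replica currently in the ISR.

    isr:  set of replica names that are in sync
    logs: dict mapping replica name -> list of messages
    Returns the sorted list of replicas that stored the message.
    """
    # Assumption: with acks=-1, reject the write up front when too few
    # replicas are in sync, instead of acknowledging from the leader alone.
    if acks == -1 and len(isr) < min_isr:
        raise NotEnoughReplicas(f"ISR size {len(isr)} < min_isr {min_isr}")
    for replica in isr:
        logs[replica].append(message)
    return sorted(isr)


logs = {"A": [], "B": [], "C": []}

# All three replicas in sync: the write lands on A, B and C.
produce({"A", "B", "C"}, min_isr=2, acks=-1, message="m1", logs=logs)

# B and C have dropped out of the ISR. With min_isr=2 the write is refused,
# so nothing is acknowledged that lives on A alone.
try:
    produce({"A"}, min_isr=2, acks=-1, message="m2", logs=logs)
except NotEnoughReplicas as exc:
    print("rejected:", exc)

# With min_isr=1 (roughly the 0.8.1.1 behavior described above) the write is
# accepted and stored only on A, so a disk failure on A would lose it.
produce({"A"}, min_isr=1, acks=-1, message="m2", logs=logs)
```

In this model the only difference between the two outcomes is the min_isr value, which is the trade-off discussed above: a higher min_isr blocks writes sooner but never acknowledges a message held by fewer than min_isr replicas.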