[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148407#comment-14148407 ]
Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------

Thank you for summarizing all the thoughts, Jay.

1. I had issues with how ack was designed initially with the min_isr config, and it looks a lot better now with ack = 0, ack = 1 and ack = -1. I still think ack should be an enum explaining what it does rather than using -1 or any arbitrary integers.
2. I don't see the value of min_isr if it does not prevent data loss under unclean leader election. If it was a clean leader election, we would always have one other replica that has the data and min_isr does not add any more value. It is completely possible to ensure there is no data loss with unclean leader election using min_isr, and I think that is the real benefit of it.
3. As I had said previously, I like the sender to know what guarantees they get when they send the request and would opt for min_isr being exposed at the API level.
4. W.r.t. your last point, I think it may not be possible to avoid duplicates by failing before writing to the log. The reason is that the ISR could become smaller than min_isr just after the check, and we could still end up failing the request after a timeout. Agreed, this is an edge case and we end up with far fewer duplicates. So I think you would need the check in both places.

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch
>
>
> In a mission critical application, we expect a kafka cluster with 3 brokers can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens.
> We found that the current kafka version (0.8.1.1) cannot achieve the requirements due to three of its behaviors:
> 1. When choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker.
> The following is an analytical proof.
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.
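
The acks=-1 case in the description and point 4 of the comment can be sketched together as a toy model. This is only an illustration under stated assumptions, not Kafka's broker code: the names Partition, NotEnoughReplicas, produce, on_wait and run_scenarios are hypothetical, the min_isr parameter stands for the setting discussed in the comment, and the on_wait callback simulates the ISR changing while the leader waits for its followers.

```python
class NotEnoughReplicas(Exception):
    """Hypothetical error: the ISR is smaller than the required minimum."""


class Partition:
    """Toy single-partition model; not Kafka's actual implementation."""

    def __init__(self, brokers, leader, isr, min_isr=1):
        self.leader = leader
        self.isr = set(isr)        # in-sync replicas, including the leader
        self.min_isr = min_isr     # min_isr=1 models the 0.8.1.1 behaviour
        self.logs = {b: [] for b in brokers}

    def produce(self, msg, on_wait=None):
        """Model an acks=-1 request; return how many brokers hold msg at ack time."""
        # First check: fail before writing, so nothing is left in the log.
        if len(self.isr) < self.min_isr:
            raise NotEnoughReplicas("rejected before write")
        self.logs[self.leader].append(msg)
        if on_wait is not None:
            on_wait(self)          # simulated ISR change during replication
        for broker in self.isr - {self.leader}:
            self.logs[broker].append(msg)
        # Second check: the ISR may have shrunk after the first check.
        if len(self.isr) < self.min_isr:
            raise NotEnoughReplicas("failed after write")
        return len(self.isr)


def run_scenarios():
    brokers = ["A", "B", "C"]
    results = {}

    # Case 3 of the proof: B and C have left the ISR, no minimum is enforced.
    p = Partition(brokers, leader="A", isr={"A"}, min_isr=1)
    results["acked_copies"] = p.produce("m")

    # Same state with min_isr=2: rejected early, leader log stays empty.
    p = Partition(brokers, leader="A", isr={"A"}, min_isr=2)
    try:
        p.produce("m")
    except NotEnoughReplicas as err:
        results["early"] = (str(err), list(p.logs["A"]))

    # ISR shrinks just after the first check: the request fails, but m is
    # already in the leader log, so a producer retry would duplicate it.
    p = Partition(brokers, leader="A", isr={"A", "B"}, min_isr=2)
    try:
        p.produce("m", on_wait=lambda part: part.isr.discard("B"))
    except NotEnoughReplicas as err:
        results["late"] = (str(err), list(p.logs["A"]))

    return results


if __name__ == "__main__":
    for name, value in run_scenarios().items():
        print(name, value)
```

In the first scenario the message is acknowledged while only one broker holds it, which is the loss window described in case 3. The second and third scenarios show why the check is useful both before and after the write: the early check avoids most duplicates, and the late check covers the case where the ISR shrinks in between.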