[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gwen Shapira updated KAFKA-1555:
--------------------------------
    Attachment: KAFKA-1555.0.patch

Tested the change with a 3-broker cluster, taking one broker down to drop the ISR size. It seems to be working. But I have a few issues and would appreciate some advice:

* Not sure I like how I'm handling the fact that the log is an Option (getting the defaults from the new config if the log is not found, which really shouldn't happen). Better ideas will be appreciated :)
* I'd like to validate that min.insync.replicas <= #replicas, i.e. reject a configuration where min.insync.replicas exceeds the replication factor. But I'm not sure where to do it. (A sketch of one option follows the quoted issue below.)
* I'm returning a status of "true" together with an error from checkEnoughReplicasReachOffset(). This was supposed to indicate that there are enough acks, even though there are not enough replicas. This is pretty meaningless, since isSatisfied completely ignores the "hasEnough" flag if we are sending an error. So eventHandler sees the error via response.status.filter(_._2.error != ErrorMapping.NoError), retries 3 times, and the message is written 3 times. (A self-contained sketch of this duplication also follows below.) I'm pretty sure this is not what we wanted. However, not returning an error is not an option either. Should we just document the behavior and let the producer decide what to do with the retries?

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch
>
>
> In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.
> We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
> 1. When choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader.
> 3. ISR can contain only 1 broker, so acknowledged messages may be stored on only 1 broker.
> The following is an analytical proof.
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas - leader A, followers B and C - are in sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are the following cases:
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirements: acks=0 and acks=1 acknowledge before full replication (message loss when the leader fails), and acks=3 blocks as soon as a single broker is down.
> 2. acks=2. Producer sends a message m. It is acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.
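On the second bullet above (where to validate min.insync.replicas against the replica count): a minimal sketch of what such a check could look like. The helper name and call site are hypothetical, not part of the attached patch; topic creation / config alteration is one candidate spot, since the replication factor and the override are both known there.

{code}
// Hypothetical helper -- the name and call site are illustrative, not from the patch.
// One option is to invoke this at topic creation / config alteration time,
// where the replication factor and the min.insync.replicas override meet.
def validateMinInsyncReplicas(topic: String, replicationFactor: Int, minIsr: Int) {
  if (minIsr > replicationFactor)
    throw new IllegalArgumentException(
      "min.insync.replicas (%d) for topic %s cannot exceed its replication factor (%d)"
        .format(minIsr, topic, replicationFactor))
}
{code}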
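On the third bullet: a self-contained toy model (plain Scala, no Kafka classes; all names are made up) of the duplication described there. The "broker" appends before discovering the ISR is too small and returns an error anyway; a "producer" that retries on any non-NoError status then re-sends an already-persisted message.

{code}
object RetryDuplication extends App {
  val log = scala.collection.mutable.ArrayBuffer[String]() // stands in for the partition log
  val isrSize = 1           // the ISR has shrunk below min.insync.replicas
  val minInsyncReplicas = 2

  // "Broker": the append happens first; the replica check happens after,
  // so we return an error even though the message is already in the log.
  def produce(msg: String): Option[String] = {
    log += msg
    if (isrSize < minInsyncReplicas) Some("NotEnoughReplicas") else None
  }

  // "Producer": like the eventHandler filter quoted above, retry on any error.
  var attempts = 0
  while (attempts < 3 && produce("m").isDefined)
    attempts += 1

  println(log) // ArrayBuffer(m, m, m): one logical message, written 3 times
}
{code}

The toy shows why the error is both truthful and dangerous: the durability guarantee was not met, yet every retry duplicates an append that already succeeded, which is what makes "document it and let the producer decide" look like the pragmatic answer.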
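Finally, tying back to the proof in the quoted description: a hedged sketch of the configuration this patch is meant to enable. min.insync.replicas is the new setting introduced here; request.required.acks is the existing 0.8-era producer property.

{code}
# Broker/topic config (the setting this patch introduces): an acks=-1 produce
# only succeeds while at least this many replicas are in sync; otherwise the
# broker returns an error instead of under-replicating silently.
min.insync.replicas=2

# Producer config: wait for acknowledgement from all in-sync replicas.
request.required.acks=-1
{code}

On a 3-broker cluster this pairing addresses both requirements: an acknowledged message sits on at least 2 replicas, so losing 1 broker loses no data, and with 2 brokers down, writes are blocked rather than lost.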