[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

Joel Koshy (JIRA) Thu, 30 Oct 2014 19:14:23 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191196#comment-14191196
 ]


Joel Koshy commented on KAFKA-1555:
-----------------------------------

Sorry for the late feedback.

configuration.html:
- min.insync.replicas: "When a producer sets request.required.acks"
  - I don't think we should start with that statement, because the
    min.insync.replicas concept is more general than its effect on the
    producer's result, i.e., it is independent of the producer's ack config.
    It is used to help decide when to declare a message committed (this
    is what I was trying to convey in my earlier comment).

design.html:
- Availability and durability guarantees:
  - Similar point - there are two concepts at play here: the ack setting on
    the producer and the durability guarantee on the broker. The mechanism
    for committing messages is separate from the producer's ack setting. The
    section starts off from the perspective of the producer and then
    progresses into durability. I actually think it might be clearer to do
    it the other way around. That is, first say that a message received at
    the broker is committed (and declared durable and exposed to consumers)
    once the replicas in the ISR have received it; then note that the ISR
    can shrink, so a higher min.isr facilitates a stronger durability
    guarantee; and only then talk about the producer ack setting.

This is just a suggestion - I'm not sure whether the above will be better,
but I think it would be a more intuitive progression for users.
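
To make the suggested ordering concrete, here is a minimal sketch of the
broker-side view, written as a toy model rather than Kafka code. The function
names, the replica names A/B/C, and the offsets are all assumptions made for
illustration; the only ideas taken from the discussion are that a message is
committed once the ISR has it, and that a minimum ISR size can be required.

```python
# Toy model, not Kafka code: every name here is illustrative.

def high_watermark(log_end_offsets, isr):
    """Number of messages held by every replica in the ISR."""
    return min(log_end_offsets[r] for r in isr)

def is_committed(offset, log_end_offsets, isr):
    """A message at `offset` is committed once every ISR member has it."""
    return offset < high_watermark(log_end_offsets, isr)

def accepts_write(isr, min_isr):
    """Assumed broker-side rule: refuse writes when the ISR is too small."""
    return len(isr) >= min_isr

# A is the leader; C is one message behind.
log_end_offsets = {"A": 5, "B": 5, "C": 4}
full_isr = {"A", "B", "C"}
shrunk_isr = {"A"}

print(is_committed(4, log_end_offsets, full_isr))    # C lacks offset 4
print(is_committed(4, log_end_offsets, {"A", "B"}))  # both ISR members have it
print(is_committed(4, log_end_offsets, shrunk_isr))  # committed, but only one copy
print(accepts_write(shrunk_isr, min_isr=2))          # ISR too small
```

Note that nothing in the commit rule refers to the producer: the ack setting
only decides how long the producer waits before it treats a send as done.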


> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
> KAFKA-1555-DOCS.2.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, 
> KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, 
> KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, 
> KAFKA-1555.8.patch, KAFKA-1555.9.patch
>
>
> In a mission critical application, we expect a kafka cluster with 3 brokers 
> can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be 
> blocked, but no message loss happens.
> We found that the current Kafka version (0.8.1.1) cannot meet these 
> requirements due to three of its behaviors:
> 1. When choosing a new leader from 2 followers in the ISR, the one with 
> fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when 
> it has fewer messages than the leader.
> 3. The ISR can contain only 1 broker, so acknowledged messages may be 
> stored on only 1 broker.
> The following is an analytical proof. 
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
> that at the beginning, all 3 replicas, leader A, followers B and C, are in 
> sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are 
> the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
> time, although C hasn't received m, C is still in ISR. If A is killed, C can 
> be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
> message m to A, and receives an acknowledgement. Disk failure happens in A 
> before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.
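
Cases 2 and 3 of the proof above can be replayed with a small toy model. This
is a sketch under stated assumptions, not Kafka code: the `surviving_log`
helper and the `min_isr` guard are hypothetical names introduced only for
illustration, with the guard standing in for a required minimum ISR size.

```python
# Toy replay of cases 2 and 3; names are illustrative, not Kafka APIs.

def surviving_log(logs, failed, new_leader):
    """Log that consumers see after `failed` is lost and `new_leader` takes over."""
    assert new_leader != failed
    return logs[new_leader]

# Case 2 (acks=2): A and B hold m, while C is still in the ISR without it.
# A is killed and C is elected leader.
logs = {"A": ["m"], "B": ["m"], "C": []}
case2_lost = "m" not in surviving_log(logs, failed="A", new_leader="C")

# Case 3 (acks=-1): the ISR has shrunk to {A}. A acknowledges m, then its
# disk fails before B or C replicates m.
logs = {"A": ["m"], "B": [], "C": []}
isr = {"A"}
case3_lost = all("m" not in surviving_log(logs, "A", r) for r in ("B", "C"))

# Hypothetical guard: with a minimum ISR size of 2, the write in case 3 is
# refused rather than acknowledged, so the producer knows m was not stored.
min_isr = 2
case3_write_accepted = len(isr) >= min_isr

print(case2_lost, case3_lost, case3_write_accepted)
```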



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
