Looks great, Gwen. I've added a few comments to the ticket.

On Mon, Oct 20, 2014 at 2:32 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

> Hi Kyle,
>
> I added new documentation, which will hopefully help. Please take a look
> here:
> https://issues.apache.org/jira/browse/KAFKA-1555
>
> I've heard rumors that you are very very good at documenting, so I'm
> looking forward to your comments.
>
> Note that I'm completely ignoring the acks>1 case since we are about
> to remove it.
>
> Gwen
>
> On Wed, Oct 15, 2014 at 1:21 PM, Kyle Banker <kyleban...@gmail.com> wrote:
> > Thanks very much for these clarifications, Gwen.
> >
> > I'd recommend modifying the following phrase describing "acks=-1":
> >
> > "This option provides the best durability, we guarantee that no messages
> > will be lost as long as at least one in sync replica remains."
> >
> > The "as long as at least one in sync replica remains" is such a huge
> > caveat. It should be noted that "acks=-1" provides no actual durability
> > guarantees unless min.isr is also used to specify a majority of replicas.
> >
> > In addition, I was curious if you might comment on my other recent
> > posting "Consistency and Availability on Node Failures" and possibly add
> > this scenario to the docs. With acks=-1 and min.isr=2 and a 3-replica
> > topic in a 12-node Kafka cluster, there's a relatively high probability
> > that losing 2 nodes from this cluster will result in an inability to
> > write to the cluster.
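
To put a rough number on the 12-node scenario quoted above, here is a back-of-the-envelope sketch. It assumes replicas are placed uniformly at random and independently per partition, which is a simplification and not how Kafka actually assigns replicas, and the partition count of 50 is a made-up figure used only for illustration. The function names are hypothetical.

```python
from math import comb


def p_partition_blocked(nodes: int, replicas: int, failed: int, min_isr: int) -> float:
    """Chance that one partition drops below min_isr live replicas when
    `failed` of `nodes` brokers are lost (uniform random placement assumed)."""
    total = comb(nodes, failed)
    blocked = 0
    # The partition is blocked once it loses more than (replicas - min_isr) replicas.
    for lost in range(replicas - min_isr + 1, min(replicas, failed) + 1):
        blocked += comb(replicas, lost) * comb(nodes - replicas, failed - lost)
    return blocked / total


def p_any_partition_blocked(partitions: int, p_single: float) -> float:
    """Chance that at least one of `partitions` partitions is blocked,
    treating partitions as independent (an assumption)."""
    return 1.0 - (1.0 - p_single) ** partitions


p_one = p_partition_blocked(nodes=12, replicas=3, failed=2, min_isr=2)
print(p_one)                               # 3/66, about 0.045
print(p_any_partition_blocked(50, p_one))  # about 0.90 for a hypothetical 50 partitions
```

Under these assumptions a single partition is rarely affected, but across many partitions it becomes likely that at least one of them rejects writes, which is consistent with the concern raised above.
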
> >
> > On Tue, Oct 14, 2014 at 4:50 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> >
> >> ack = 2 *will* throw an exception when there's only one node in ISR.
> >>
> >> The problem with ack=2 is that if you have 3 replicas and you got acks
> >> from 2 of them, the one replica which did not get the message can
> >> still be in ISR and get elected as leader, leading to a loss of the
> >> message. If you specify ack=3, you can't tolerate the failure of a
> >> single replica. Not amazing either.
> >>
> >> To make things even worse, when specifying the number of acks you
> >> want, you don't always know how many replicas the topic should have,
> >> so it's difficult to pick the correct number.
> >>
> >> acks = -1 solves that problem (since all messages need to get acked by
> >> all replicas), but introduces the new problem of not getting an
> >> exception if ISR shrank to 1 replica.
> >>
> >> That's why the min.isr configuration was added.
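
The ack behaviour described in the message quoted here can be summarized in a toy model. This is only a sketch of the semantics as stated in this thread, not Kafka's broker code; `required_acks`, `NotEnoughReplicas`, and the `min_isr` parameter are hypothetical names chosen for the example.

```python
class NotEnoughReplicas(Exception):
    """Raised when the toy model rejects a write."""


def required_acks(acks: int, isr_size: int, min_isr: int = 1) -> int:
    """Return how many replicas must acknowledge a write, or raise if the
    write is rejected. Models only the cases discussed in this thread."""
    if isr_size < 1:
        raise NotEnoughReplicas("no in-sync replica available")
    if acks == -1:
        # acks=-1 waits for every replica currently in the ISR, however few.
        if isr_size < min_isr:
            raise NotEnoughReplicas(f"ISR size {isr_size} is below min.isr {min_isr}")
        return isr_size
    if acks > isr_size:
        # A fixed ack count fails when the ISR is smaller than that count.
        raise NotEnoughReplicas(f"need {acks} acks but ISR size is {isr_size}")
    return acks


print(required_acks(-1, isr_size=1))  # 1: accepted with a single copy
print(required_acks(2, isr_size=3))   # 2: one in-sync replica may lack the message
try:
    required_acks(-1, isr_size=1, min_isr=2)
except NotEnoughReplicas as err:
    print("rejected:", err)
```

The first call shows the gap that min.isr closes: without it, acks=-1 silently succeeds when the ISR has shrunk to one replica.
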
> >>
> >> I hope this clarifies things :)
> >> I'm planning to add this to the docs in a day or two, so let me know
> >> if there are any additional explanations or scenarios you think we
> >> need to include.
> >>
> >> Gwen
> >>
> >> On Tue, Oct 14, 2014 at 12:27 PM, Scott Reynolds <sreyno...@twilio.com> wrote:
> >> > A question about 0.8.1.1 and acks. I was under the impression that
> >> > setting acks to 2 will not throw an exception when there is only one
> >> > node in ISR. Am I incorrect? Thus the need for min_isr.
> >> >
> >> > On Tue, Oct 14, 2014 at 11:50 AM, Kyle Banker <kyleban...@gmail.com> wrote:
> >> >
> >> >> It's quite difficult to infer from the docs the exact techniques
> >> >> required to ensure consistency and durability in Kafka. I propose that
> >> >> we add a doc section detailing these techniques. I would be happy to
> >> >> help with this.
> >> >>
> >> >> The basic question is this: assuming that I can afford to temporarily
> >> >> halt production to Kafka, how do I ensure that no message written to
> >> >> Kafka is ever lost under typical failure scenarios (i.e., the loss of a
> >> >> single broker)?
> >> >>
> >> >> Here's my understanding of this for Kafka v0.8.1.1:
> >> >>
> >> >> 1. Create a topic with a replication factor of 3.
> >> >> 2. Use a sync producer and set acks to 2. (Setting acks to -1 may
> >> >> successfully write even in a case where the data is written only to a
> >> >> single node).
> >> >>
> >> >> Even with these two precautions, there's always the possibility of an
> >> >> "unclean leader election." Can data loss still occur in this scenario?
> >> >> Is it possible to achieve this level of durability on v0.8.1.1?
> >> >>
> >> >> In Kafka v0.8.2, in addition to the above:
> >> >>
> >> >> 3. Ensure that the triple-replicated topic also disallows unclean
> >> >> leader election (https://issues.apache.org/jira/browse/KAFKA-1028).
> >> >>
> >> >> 4. Set the min.isr value of the producer to 2 and acks to -1
> >> >> (https://issues.apache.org/jira/browse/KAFKA-1555). The producer will
> >> >> then throw an exception if data can't be written to 2 out of 3 nodes.
> >> >>
> >> >> In addition to producer configuration and usage, there are also
> >> >> monitoring and operations considerations for achieving durability and
> >> >> consistency. As those are rather nuanced, it'd probably be easiest to
> >> >> just start iterating on a document to flesh those out.
> >> >>
> >> >> If anyone has any advice on how to better specify this, or how to get
> >> >> started on improving the docs, I'm happy to help out.
> >> >>
> >>
>
