Thanks for the Jira info. Just to clarify: in the case we are outlining above, would the producer have received an ack on m2 (with acks = -1) or not? If not, then I have no concerns; if so, then how would the producer know to re-publish?
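For concreteness, the only signal the producer gets is the completion of the send itself. A minimal sketch of where the re-publish decision would live, assuming the new Java producer client slated for 0.8.2 (the broker address and topic name are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class AckedSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");  // placeholder address
        props.put("acks", "all");                        // same as acks=-1
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<String, String>("my-topic", "m2"),
                new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata md, Exception e) {
                        if (e != null) {
                            // No ack: m2 may or may not be in a log somewhere.
                            // An at-least-once producer re-publishes here,
                            // accepting a possible duplicate.
                        } else {
                            // Acked at offset md.offset(): with acks=-1 this
                            // means the message was committed to the full ISR.
                        }
                    }
                });
        producer.close();
    }
}

If m2 was never committed, a send with acks=-1 should complete with an error rather than an ack, so the callback, not any broker-side signal, is the place where re-publishing would have to happen.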
On Thu, Jul 24, 2014 at 9:38 AM, Jun Rao <jun...@gmail.com> wrote:

> About re-publishing m2: it seems it's better to let the producer choose
> whether to do this or not.
>
> There is another known bug, KAFKA-1211, that's not fixed yet. The
> situation in which it can happen is relatively rare and the fix is
> slightly involved, so it may not be addressed in 0.8.2.
>
> Thanks,
>
> Jun
>
> On Tue, Jul 22, 2014 at 8:36 PM, scott@heroku <sc...@heroku.com> wrote:
>
>> Thanks so much for the detailed explanation Jun, it pretty much lines
>> up with my understanding.
>>
>> In the case below, if we didn't particularly care about ordering and
>> re-produced m2, it would then become m5, and in many use cases this
>> would be ok?
>>
>> Perhaps a more direct question would be: once 0.8.2 is out and I have
>> a topic with unclean leader election disabled, and produce with
>> acks = -1, are there any known series of events (other than disk
>> failures on all brokers) that would cause the loss of messages that a
>> producer has received an ack for?
>>
>> Sent from my iPhone
>>
>>> On Jul 22, 2014, at 8:17 PM, Jun Rao <jun...@gmail.com> wrote:
>>>
>>> The key point is that we have to keep all replicas consistent with
>>> each other, such that no matter which replica a consumer reads from,
>>> it always reads the same data at a given offset. The following is an
>>> example.
>>>
>>> Suppose we have 3 brokers A, B and C. Let's assume A is the leader
>>> and at some point we have the following offsets and messages in each
>>> replica.
>>>
>>> offset   A    B    C
>>>   1      m1   m1   m1
>>>   2      m2
>>>
>>> Let's assume that message m1 is committed and message m2 is not. At
>>> exactly this moment, replica A dies. After a new leader is elected,
>>> say B, new messages can be committed with just replicas B and C. If
>>> at some point later we commit two more messages m3 and m4, we will
>>> have the following.
>>>
>>> offset   A    B    C
>>>   1      m1   m1   m1
>>>   2      m2   m3   m3
>>>   3           m4   m4
>>>
>>> Now A comes back. For consistency, it's important for A's log to be
>>> identical to B's and C's. So we have to remove m2 from A's log and
>>> add m3 and m4. As you can see, whether you want to republish m2 or
>>> not, m2 cannot stay at its current offset, since in the other
>>> replicas that offset is already taken by other messages. Therefore,
>>> a truncation of replica A's log is needed to keep the replicas
>>> consistent. Currently, we don't republish messages like m2 because
>>> (1) it's not necessary, since m2 was never considered committed, and
>>> (2) it would make our protocol more complicated.
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>>> On Tue, Jul 22, 2014 at 3:40 PM, scott@heroku <sc...@heroku.com>
>>>> wrote:
>>>>
>>>> Thanks Jun
>>>>
>>>> Can you explain a little more about what an uncommitted message
>>>> means? The messages are in the log, so presumably they have been
>>>> acked at least by the local broker?
>>>>
>>>> I guess I am hoping for some intuition around why 'replaying' the
>>>> messages in question would cause bad things.
>>>>
>>>> Thanks!
>>>>
>>>> Sent from my iPhone
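As a toy model of Jun's offset example above (plain Java collections, not Kafka's actual replication code): the restarting replica A truncates back to the last point it knows was committed, then re-fetches everything after that point from the new leader B:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TruncationToy {
    public static void main(String[] args) {
        // Jun's example: m1 is committed everywhere; the uncommitted m2
        // exists only on the old leader A.
        List<String> a = new ArrayList<>(Arrays.asList("m1", "m2"));
        List<String> b = new ArrayList<>(Arrays.asList("m1", "m3", "m4")); // new leader

        int safePoint = 1; // A knows only m1 was committed

        // On restart, A truncates to the safe point (dropping m2)...
        a.subList(safePoint, a.size()).clear();
        // ...then syncs the remaining messages from the current leader.
        a.addAll(b.subList(safePoint, b.size()));

        System.out.println(a); // [m1, m3, m4]: identical to B at every offset
    }
}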
>>>> On Jul 22, 2014, at 3:06 PM, Jun Rao <jun...@gmail.com> wrote:
>>>>
>>>>> Scott,
>>>>>
>>>>> The reason for truncation is that the broker that comes back may
>>>>> have some uncommitted messages. Those messages shouldn't be exposed
>>>>> to the consumer and therefore need to be removed from the log. So,
>>>>> on broker startup, we first truncate the log to a safe point before
>>>>> which we know all messages are committed. The broker will then sync
>>>>> up with the current leader to get the remaining messages.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jun
>>>>>
>>>>> On Tue, Jul 22, 2014 at 9:42 AM, Scott Clasen <sc...@heroku.com>
>>>>> wrote:
>>>>>
>>>>>> Ahh, yes, that message loss case. I've wondered about that myself.
>>>>>>
>>>>>> I guess I don't really understand why truncating messages is ever
>>>>>> the right thing to do. Since Kafka is an 'at least once' system
>>>>>> (send a message, get no ack, and it still might be on the topic),
>>>>>> consumers that care will have to de-dupe anyhow.
>>>>>>
>>>>>> To the Kafka designers: is there anything preventing implementation
>>>>>> of alternatives to truncation? When a broker comes back online and
>>>>>> needs to truncate, can't it fire up a producer and take the extra
>>>>>> messages and send them back to the original topic, or alternatively
>>>>>> to an error topic?
>>>>>>
>>>>>> Would love to understand the rationale for the current design, as
>>>>>> my perspective is doubtfully as clear as the designers'.
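To make the proposal above concrete, here is roughly what such a hook might look like. This is purely hypothetical: Kafka exposes no truncation callback, and onTruncate and the ".truncated" topic suffix are invented for illustration; it only sketches the idea of salvaging messages to an error topic:

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TruncationSalvager {
    private final KafkaProducer<byte[], byte[]> producer;

    // producerProps must configure ByteArraySerializer for key and value.
    public TruncationSalvager(Properties producerProps) {
        this.producer = new KafkaProducer<>(producerProps);
    }

    // Hypothetical hook: called with the messages a restarting broker is
    // about to truncate, before they are removed from the log.
    public void onTruncate(String topic, List<byte[]> truncated) {
        for (byte[] value : truncated) {
            // Re-publish to a parallel error topic so a de-duping consumer
            // can decide what to do with the possibly-uncommitted messages.
            producer.send(new ProducerRecord<byte[], byte[]>(topic + ".truncated", value));
        }
    }
}

One catch, following Jun's explanation: the salvaged copies necessarily get new offsets, so this only helps consumers that identify messages by content rather than by offset.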
>>>>>> On Tue, Jul 22, 2014 at 6:21 AM, Jiang Wu (Pricehistory)
>>>>>> (BLOOMBERG/ 731 LEX -) <jwu...@bloomberg.net> wrote:
>>>>>>
>>>>>>> kafka-1028 addressed another unclean leader election problem. It
>>>>>>> prevents a broker not in the ISR from becoming a leader. The
>>>>>>> problem we are facing is that a broker in the ISR, but without
>>>>>>> complete messages, may become a leader. It's also a kind of
>>>>>>> unclean leader election, but not the one that kafka-1028
>>>>>>> addressed.
>>>>>>>
>>>>>>> Here I'm trying to give a proof that current Kafka doesn't
>>>>>>> achieve the requirement (no message loss, no blocking when 1
>>>>>>> broker is down) due to two of its behaviors:
>>>>>>> 1. when choosing a new leader from 2 followers in the ISR, the
>>>>>>> one with fewer messages may be chosen as the leader;
>>>>>>> 2. even when replica.lag.max.messages=0, a follower can stay in
>>>>>>> the ISR when it has fewer messages than the leader.
>>>>>>>
>>>>>>> We consider a cluster with 3 brokers and a topic with 3 replicas.
>>>>>>> We analyze different cases according to the value of
>>>>>>> request.required.acks (acks for short). For each case and its
>>>>>>> subcases, we find situations in which either message loss or
>>>>>>> service blocking happens. We assume that at the beginning all 3
>>>>>>> replicas (leader A, followers B and C) are in sync, i.e., they
>>>>>>> have the same messages and are all in the ISR.
>>>>>>>
>>>>>>> 1. acks=0, 1, 3. Obviously these settings do not satisfy the
>>>>>>> requirement.
>>>>>>> 2. acks=2. The producer sends a message m. It's acknowledged by A
>>>>>>> and B. At this time, although C hasn't received m, it's still in
>>>>>>> the ISR. If A is killed, C can be elected as the new leader, and
>>>>>>> consumers will miss m.
>>>>>>> 3. acks=-1. Suppose replica.lag.max.messages=M. There are two
>>>>>>> subcases:
>>>>>>> 3.1 M>0. Suppose C is killed. C will be out of the ISR after
>>>>>>> replica.lag.time.max.ms. Then the producer publishes M messages
>>>>>>> to A and B. C restarts. C will rejoin the ISR since it is only M
>>>>>>> messages behind A and B. Before C replicates all the messages, A
>>>>>>> is killed and C becomes leader; then message loss happens.
>>>>>>> 3.2 M=0. In this case, when the producer publishes at a high
>>>>>>> speed, B and C will fall out of the ISR, and only A keeps
>>>>>>> receiving messages. Then A is killed. Either message loss or
>>>>>>> service blocking will happen, depending on whether unclean leader
>>>>>>> election is enabled.
>>>>>>>
>>>>>>> From: users@kafka.apache.org At: Jul 21 2014 22:28:18
>>>>>>> To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -),
>>>>>>> users@kafka.apache.org
>>>>>>> Subject: Re: how to ensure strong consistency with reasonable
>>>>>>> availability
>>>>>>>
>>>>>>> You will probably need 0.8.2, which gives
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1028
>>>>>>>
>>>>>>> On Mon, Jul 21, 2014 at 6:37 PM, Jiang Wu (Pricehistory)
>>>>>>> (BLOOMBERG/ 731 LEX -) <jwu...@bloomberg.net> wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> With a cluster of 3 brokers and a topic of 3 replicas, we want
>>>>>>>> to achieve the following two properties:
>>>>>>>> 1. when only one broker is down, there's no message loss, and
>>>>>>>> producers/consumers are not blocked;
>>>>>>>> 2. in other, more serious situations (for example, one broker is
>>>>>>>> restarted twice in a short period, or two brokers are down at
>>>>>>>> the same time), producers/consumers can be blocked, but no
>>>>>>>> message loss is allowed.
>>>>>>>>
>>>>>>>> We haven't found any producer/broker parameter combinations that
>>>>>>>> achieve this. If you know or think some configurations will
>>>>>>>> work, please post details. We have a test bed to verify any
>>>>>>>> given configurations.
>>>>>>>>
>>>>>>>> In addition, I'm wondering if it's necessary to open a jira to
>>>>>>>> request the above feature?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiang
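For reference, the closest thing to the requested configuration that this thread identifies, as a sketch assuming 0.8.2, where KAFKA-1028 adds the unclean-leader flag (broker addresses and the retry count are placeholders). As the case analysis above shows, acks=-1 alone still leaves the ISR-membership gap of case 3:

import java.util.Properties;

public class ConsistencyConfig {
    // Producer side: wait for the full ISR and retry unacked sends.
    public static Properties producerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        p.put("acks", "-1");      // a.k.a. acks=all: wait for the full ISR
        p.put("retries", "3");    // re-publish on a missing ack (at-least-once)
        return p;
    }

    // Broker side (server.properties), per the thread:
    //   default.replication.factor=3
    //   unclean.leader.election.enable=false   (added by KAFKA-1028, 0.8.2)
    //   replica.lag.max.messages=M             (cases 3.1/3.2 above: no value
    //                                           of M alone closes the gap)
}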