Hi Michal,

Thanks for the links; they really help. It now looks like with
request.required.acks=1 (let alone 0), messages can be lost in the case I
described. The aphyr article seems to describe an even trickier case than
mine.

I'm still not sure about Kafka's behavior with request.required.acks=-1.
With this setting in effect, the scenario turns into this:

1. Producer sends message A to the leader.
2. Leader stores the message, followers fetch it. Everyone's in sync.
3. Producer sends message B to the leader. The followers haven't fetched
the message yet and lag by 1 message: B exists only on broker 1. The
produce request for message B sits in purgatory waiting for
acknowledgement from the followers.
4. The leader goes down.
5. The followers cannot fetch B anymore, since its only owner is down. Yet
one of the replicas needs to take over leadership. Say broker 2 now
becomes the leader and 3 the follower.
6. Producer sends message C to the leader (broker 2). The follower fetches it.
7. Broker 1 goes back online and starts following broker 2.

What happens to message B, which was sitting in broker 1's purgatory when
it went down? Will an error along the lines of "unable to send message B"
be returned to the producer immediately after broker 1 shuts down?
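For reference, here is roughly how I set acks=-1 on my side with the 0.8
Java producer API; a minimal sketch, with the broker list, topic name, and
message contents as placeholders for my real setup:

    import java.util.Properties;

    import kafka.common.FailedToSendMessageException;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class AcksAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker list: brokers 1, 2, 3 from the scenario above.
            props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            // -1 = wait for acknowledgement from all in-sync replicas.
            props.put("request.required.acks", "-1");

            Producer<String, String> producer =
                    new Producer<String, String>(new ProducerConfig(props));
            try {
                // With the default sync producer, send() blocks until the
                // request is acknowledged or fails.
                producer.send(new KeyedMessage<String, String>("my-topic", "message B"));
            } catch (FailedToSendMessageException e) {
                // My assumption is that an unacknowledged send (message B in
                // the scenario above) surfaces here once retries are
                // exhausted - but that is exactly what I'd like to confirm.
            } finally {
                producer.close();
            }
        }
    }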


2014-06-30 17:41 GMT+04:00 Michal Michalski <michal.michal...@boxever.com>:

> Hi Yury,
>
> If I understand correctly, the case you're describing is equivalent to
> leader re-election (in terms of data consistency). In that case, messages
> can be lost depending on your "acks" setting:
>
> https://kafka.apache.org/documentation.html
> see: request.required.acks:
> E.g. "only messages that were written to the now-dead leader but not yet
> replicated will be lost)." for acks=1
>
> More info on that:
> http://aphyr.com/posts/293-call-me-maybe-kafka
>
> However, I'd be happy if someone with more Kafka experience confirmed my
> understanding of that issue.
>
>
> Kind regards,
> MichaƂ Michalski,
> michal.michal...@boxever.com
>
>
> On 30 June 2014 14:34, Yury Ruchin <yuri.ruc...@gmail.com> wrote:
>
> > Thanks!
> >
> > I'm also trying to understand how the replicas will catch up once the
> > leader goes down. Say we have 3 brokers with IDs 1, 2, 3. The leader is
> > broker 1, and the followers are 2 and 3. Consider the following
> > scenario, assuming that all messages fall into the same partition:
> >
> > 1. Producer sends message A to the leader.
> > 2. Leader stores the message, followers fetch it. Everyone's in sync.
> > 3. Producer sends message B to the leader. The followers haven't fetched
> > the message yet and lag by 1 message: B exists only on broker 1.
> > 4. I bring the leader down.
> > 5. The followers cannot fetch B anymore, since its only owner is down.
> > Yet one of the replicas needs to take over leadership. Say broker 2 now
> > becomes the leader and 3 the follower.
> > 6. Producer sends message C to the leader (broker 2). The follower
> > fetches it.
> >
> > I don't quite understand the state of the log on replicas 2 and 3 after
> > step #6. It looks like the log will have a gap in it. The expected log
> > state is ["A", "B", "C"], but brokers 2 and 3 didn't have a chance to
> > fetch "B", so their logs look like ["A", "C"]. Will Kafka try to fill
> > the gap in the background once broker 1 starts up again?
> >
> >
> > 2014-06-18 19:59 GMT+04:00 Neha Narkhede <neha.narkh...@gmail.com>:
> >
> > > You don't gain much by running #4 between broker bounces. Running it
> > > after the cluster is upgraded will be sufficient.
> > >
> > > Thanks,
> > > Neha
> > >
> > >
> > > On Wed, Jun 18, 2014 at 8:33 AM, Yury Ruchin <yuri.ruc...@gmail.com>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > In my project, we want to update our active Kafka 0.8 cluster to
> > > > Kafka 0.8.1.1 without downtime or data loss. The process (after
> > > > reading http://kafka.apache.org/documentation.html#upgrade) looks to
> > > > me like this. For each broker in turn:
> > > >
> > > > 1. Bring the broker down.
> > > > 2. Update Kafka to 0.8.1.1 on the broker node.
> > > > 3. Start the broker.
> > > > 4. Run the preferred-replica-election script to restore the broker's
> > > > leadership for the respective partitions.
> > > > 5. Wait for the preferred replica election to complete.
> > > >
> > > > I deem step #5 necessary since preferred replica election is an
> > > > asynchronous process. There is a slim chance that bringing other
> > > > brokers down before the election is complete would result in all
> > > > replicas being down for some partitions, so a portion of the
> > > > incoming data stream would be lost. Is my understanding of the
> > > > process correct?
> > > >
> > >
> >
>
