Re: Segment recovery and replication

Sriram Subramanian Thu, 29 Aug 2013 08:52:22 -0700

Do you know why you timed out on a regular shutdown? If the replica had
fallen out of the ISR and shutdown was forced on the leader, this could
happen. With ack = -1, we guarantee that all the replicas in the in-sync
set have received the message before exposing the message to the consumer.
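
To make that guarantee concrete, here is a minimal sketch in Python. It is a
simplified model, not Kafka's actual code: the function name, broker ids and
offsets are all hypothetical. It assumes the offset exposed to consumers is
the smallest log end offset among the replicas currently in the ISR.

```python
def high_watermark(log_end_offsets, isr):
    """Return the offset below which every in-sync replica has the data.

    log_end_offsets: dict of broker id -> next offset that broker would write
    isr: set of broker ids currently considered in sync
    (Both arguments are hypothetical, for illustration only.)
    """
    return min(log_end_offsets[r] for r in isr)


# Hypothetical state: broker ids mapped to their log end offsets.
log_end_offsets = {1: 120, 2: 118, 3: 90}

# Replica 3 is lagging but still in the ISR, so consumers can only
# read below offset 90.
hw_full = high_watermark(log_end_offsets, isr={1, 2, 3})

# Replica 3 has dropped out of the ISR. The watermark advances without
# it, so offsets 90..117 are acknowledged and visible to consumers even
# though replica 3 never received them.
hw_shrunk = high_watermark(log_end_offsets, isr={1, 2})
```

In the second case, if replica 3 later became leader after the others went
down uncleanly, the acknowledged messages it never received would be gone.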


On 8/29/13 8:32 AM, "Sam Meder" <sam.me...@jivesoftware.com> wrote:

>We've recently come across a scenario where we see consumers resetting
>their offsets to earliest, which as far as I can tell may also lead to
>data loss (we're running with ack = -1 to avoid loss). This seems to
>happen when we time out on a regular shutdown and instead kill -9
>the Kafka broker, but it obviously applies to any scenario that involves
>an unclean exit. As far as I can tell, what happens is:
>
>1. On restart the broker truncates the data for the affected partitions,
>i.e. not all data was written to disk.
>2. The new broker then becomes a leader for the affected partitions and
>consumers get confused because they've already consumed beyond the now
>available offset.
>
>Does that seem like a possible failure scenario?
>
>/Sam
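
The two steps in the quoted message can be sketched as a small Python model.
This is an illustration under stated assumptions, not the consumer's real
logic: the function and parameter names are hypothetical, and the "smallest"
and "largest" values are assumed to mirror the consumer's auto.offset.reset
setting.

```python
def next_fetch_offset(consumed, log_start, log_end, reset="smallest"):
    """Pick the offset a consumer fetches from next (hypothetical model).

    consumed:  offset the consumer wants to read next
    log_start: first offset available on the current leader
    log_end:   next offset the current leader would write
    reset:     "smallest" or "largest", applied when consumed is out of range
    """
    if log_start <= consumed <= log_end:
        return consumed  # offset still valid, carry on
    # Out of range: fall back according to the reset policy.
    return log_start if reset == "smallest" else log_end


# The consumer had read up to offset 118 from the old leader. After the
# unclean restart, the new leader's truncated log only covers 0..89.
resumed = next_fetch_offset(consumed=118, log_start=0, log_end=90)
```

With the "smallest" policy the consumer jumps back to the earliest available
offset and re-reads everything, while whatever was stored at offsets 90..117
on the old leader is no longer available.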
