[ https://issues.apache.org/jira/browse/KAFKA-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670636#comment-15670636 ]
Meyer Kizner commented on KAFKA-4414:
-------------------------------------
What value would you suggest? We're already using 5000ms, which I thought was
relatively short. A shorter timeout makes the issue less likely, but it doesn't
eliminate it: there appears to be a genuine race condition here.
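For reference, the relevant broker settings look roughly like this; the
session timeout is the 5000ms mentioned above, and unclean leader election is
disabled as described in the report (property names per the 0.9 docs, shown
only for illustration):
{code}
# server.properties (illustrative excerpt)
zookeeper.session.timeout.ms=5000
unclean.leader.election.enable=false
{code}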
> Unexpected "Halting because log truncation is not allowed"
> ----------------------------------------------------------
>
> Key: KAFKA-4414
> URL: https://issues.apache.org/jira/browse/KAFKA-4414
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.1
> Reporter: Meyer Kizner
>
> Our Kafka installation runs with unclean leader election disabled, so brokers
> halt when they find that their log end offset for a partition is ahead of the
> leader's. We had two brokers halt today with this issue. After digging through
> the logs, I believe the following timeline describes what occurred and
> suggests a plausible cause.
> * B1, B2, and B3 are replicas of a topic partition, all in the ISR. B2 is
> currently the leader, but B1 is the preferred leader. The controller runs on
> B3.
> * B1 fails, but the controller does not detect the failure immediately.
> * B2 receives a message from a producer and B3 fetches it to stay up to date.
> The message is not yet committed (the high water mark stays below it),
> because B1 is down and so has not acknowledged it.
> * The controller triggers a preferred leader election, making B1 the leader,
> and notifies all replicas.
> * Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper
> expires, so the controller reassigns B2 to be leader again and notifies all
> replicas.
> * Because B3 is the controller while B2 is on another box, B3 hears about
> both of these events before B2 hears about either. B3 truncates its log to
> the high water mark (before the pending message) and resumes fetching from B2.
> * B3 fetches the pending message from B2 again.
> * B2 learns that it has been displaced and then reelected, and truncates its
> log to the high water mark (before the pending message).
> * The next time B3 tries to fetch from B2, its fetch offset is ahead of B2's
> log end offset: the pending message is missing from B2, so B3 halts (see the
> sketch after this list).
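> To make the failure condition concrete, here is a rough sketch of the
> follower-side check as I understand it; the class, method, and parameter
> names are illustrative, not the actual {{ReplicaFetcherThread}} code:
> {code:java}
> // Illustrative model of a follower whose fetch offset is out of range on
> // the leader; not the real fetcher implementation.
> class FollowerFetchSketch {
>     static long handleOffsetOutOfRange(long followerEndOffset,
>                                        long leaderEndOffset,
>                                        long leaderStartOffset,
>                                        boolean uncleanLeaderElectionEnabled) {
>         if (leaderEndOffset < followerEndOffset) {
>             // The follower is ahead of the leader, as B3 is here. With
>             // unclean leader election disabled, truncating would silently
>             // discard data the follower believes exists, so the broker
>             // halts instead.
>             if (!uncleanLeaderElectionEnabled) {
>                 throw new IllegalStateException(
>                     "Halting because log truncation is not allowed");
>             }
>             return leaderEndOffset; // truncate to the leader's end, resume
>         }
>         // Otherwise the follower has fallen behind the leader's earliest
>         // retained offset; restart fetching from there.
>         return leaderStartOffset;
>     }
> }
> {code}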
> In this case, there was no data loss or inconsistency. I haven't fully
> thought through whether either is possible, but it seems likely that both
> are, especially if there had been multiple producers to this topic.
> I'm not completely certain about this timeline, but this sequence of events
> at least appears to be possible. Looking a bit through the controller code,
> there doesn't seem to be anything that forces {{LeaderAndIsrRequest}} to be
> sent in a particular order. If someone with more knowledge of the code base
> believes this is incorrect, I'd be happy to post the logs and/or do some more
> digging.
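> As a toy model of the hazard (hypothetical code, not Kafka's), suppose each
> broker simply applies leadership updates in the order they arrive, and a
> follower truncates to its high water mark on every leader change. Then the
> timing above, where B3 applies both updates before B2 applies either, is
> enough to produce the divergence:
> {code:java}
> // Hypothetical per-partition state machine, for illustration only.
> class PartitionState {
>     private String leader;
>     private long logEndOffset;
>     private long highWaterMark;
>
>     // Apply one LeaderAndIsr-style update in arrival order. Nothing here
>     // coordinates *when* different brokers apply the same update, which is
>     // the gap the timeline above exploits.
>     void apply(String newLeader, String self) {
>         if (newLeader.equals(leader)) {
>             return; // no leadership change for this partition
>         }
>         leader = newLeader;
>         if (!self.equals(newLeader)) {
>             // Becoming a follower: truncate back to the committed point.
>             // B3 does this early and refetches; B2 does it late, ending up
>             // with fewer messages than B3 expects it to have.
>             logEndOffset = highWaterMark;
>         }
>     }
> }
> {code}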