[ https://issues.apache.org/jira/browse/KAFKA-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670636#comment-15670636 ]
Meyer Kizner commented on KAFKA-4414:
-------------------------------------

What value would you suggest? We're already using 5000ms, which I thought was
relatively short. A shorter timeout makes the issue less likely, but it looks
like there's a genuine race condition here.

> Unexpected "Halting because log truncation is not allowed"
> ----------------------------------------------------------
>
>                 Key: KAFKA-4414
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4414
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.9.0.1
>            Reporter: Meyer Kizner
>
> Our Kafka installation runs with unclean leader election disabled, so brokers
> halt when they find that their log end offset is ahead of the leader's for a
> partition. We had two brokers halt today with this issue. After much time
> spent digging through the logs, I believe the following timeline describes
> what occurred and suggests a plausible cause.
> * B1, B2, and B3 are replicas of a topic partition, all in the ISR. B2 is
> currently the leader, but B1 is the preferred leader. The controller runs on
> B3.
> * B1 fails, but the controller does not detect the failure immediately.
> * B2 receives a message from a producer and B3 fetches it to stay up to date.
> B2 has not committed the message, because B1 is down and so has not
> acknowledged it.
> * The controller triggers a preferred leader election, making B1 the leader,
> and notifies all replicas.
> * Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper
> expires, so the controller reassigns B2 to be leader again and notifies all
> replicas.
> * Because B3 is the controller, while B2 is on another box, B3 hears about
> both of these events before B2 hears about either. B3 truncates its log to
> the high watermark (before the pending message) and resumes fetching from B2.
> * B3 fetches the pending message from B2 again.
> * B2 learns that it has been displaced and then reelected, and truncates its
> log to the high watermark, before the pending message.
> * The next time B3 tries to fetch from B2, it sees that B2 is missing the
> pending message and halts.
> In this case, there was no data loss or inconsistency. I haven't fully
> thought through whether either would be possible, but it seems likely that
> they would be, especially if there had been multiple producers to this topic.
> I'm not completely certain about this timeline, but this sequence of events
> appears to at least be possible. Looking a bit through the controller code,
> there doesn't seem to be anything that forces {{LeaderAndIsrRequest}}s to be
> sent in a particular order. If someone with more knowledge of the code base
> believes this is incorrect, I'd be happy to post the logs and/or do some more
> digging.
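To make the hypothesized ordering easier to follow, here is a toy,
self-contained model of the timeline above (plain Java, not Kafka code: broker
state is reduced to a log, a high watermark, and a fetch offset, and all the
names in it are mine):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Toy model of the suspected race. B1 never appears because it is down for
// the whole sequence; only the committed/uncommitted distinction matters.
public class Kafka4414Race {
    static class Broker {
        final String name;
        final List<String> log = new ArrayList<>();
        int highWatermark = 0; // offsets below this are committed

        Broker(String name) { this.name = name; }

        int logEndOffset() { return log.size(); }

        // On a leader change, a replica truncates to its high watermark,
        // dropping any uncommitted messages (the behavior described above).
        void truncateToHighWatermark() {
            while (log.size() > highWatermark) {
                log.remove(log.size() - 1);
            }
        }
    }

    public static void main(String[] args) {
        Broker b2 = new Broker("B2"); // current leader
        Broker b3 = new Broker("B3"); // follower, co-located with controller

        // Committed state shared by the ISR: one message, HW = 1.
        b2.log.add("m0"); b2.highWatermark = 1;
        b3.log.add("m0"); b3.highWatermark = 1;

        // B2 (leader) accepts a new message; B1 is down and never
        // acknowledges it, so the high watermark stays at 1.
        b2.log.add("m1");
        // B3 fetches the pending message.
        b3.log.add("m1");

        // B3 hears both LeaderAndIsr events back-to-back (the controller is
        // local). Leader -> B1: become follower, truncate to HW.
        b3.truncateToHighWatermark();
        // Leader -> B2 again: resume fetching from B2, which still has m1.
        int b3FetchOffset = b3.logEndOffset();  // 1
        b3.log.add(b2.log.get(b3FetchOffset));  // refetch m1
        b3FetchOffset = b3.logEndOffset();      // now 2

        // Only now does B2 hear the same two events: while briefly a
        // follower of B1 it truncates to its HW, dropping m1 for good.
        b2.truncateToHighWatermark();

        // B3's next fetch is ahead of the leader's log end offset.
        System.out.printf("%s fetch offset = %d, %s log end offset = %d%n",
                b3.name, b3FetchOffset, b2.name, b2.logEndOffset());
        if (b3FetchOffset > b2.logEndOffset()) {
            System.out.println("Halting because log truncation is not allowed");
        }
    }
}
{code}

Run as written, B3's fetch offset (2) ends up ahead of B2's log end offset (1),
which is exactly the condition that triggers the halt when unclean leader
election is disabled.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)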