Jason Gustafson resolved KAFKA-9491.
------------------------------------
    Resolution: Fixed

> Fast election during reassignment can lead to replica fetcher failures
> -----------------------------------------------------------------------
>
>             Key: KAFKA-9491
>             URL: https://issues.apache.org/jira/browse/KAFKA-9491
>         Project: Kafka
>      Issue Type: Bug
>        Reporter: Jason Gustafson
>        Assignee: Jason Gustafson
>        Priority: Major
>         Fix For: 2.5.0, 2.4.1
>
> We have observed an unusual case in which a new replica became leader before it had received an initial high watermark from the previous leader. This resulted in an OffsetOutOfRangeException being raised while looking up the segment position of the uninitialized high watermark, since it was lower than the log start offset. The error was raised while handling the fetch request from one of the followers and prevented that follower from making progress.
> {code}
> org.apache.kafka.common.errors.OffsetOutOfRangeException: Received request for offset 0 for partition foo-0, but we only have log segments in the range 20 to 20.
> {code}
> Here is what we have observed from the logs. The initial state of the partition for the relevant sequence of events is the following:
> Initial state: replicas=[4,1,2,3], leader=1, isr=[1,2,3], adding=[4], removing=[1], epoch=5, logStartOffset=20, logEndOffset=20
> We see the following events:
> t0: Replica 4 becomes a follower and initializes its log with hw=0, logStartOffset=0
> t1: Replica 4 begins fetching from offset 0 and receives an out-of-range error
> t2: After a ListOffset request to the leader, replica 4 initializes logStartOffset to 20
> t3: Replica 4 sends a fetch request to the leader at start offset 20
> t4: Upon receiving the fetch request, the leader adds 4 to the ISR (i.e. isr=[1,2,3,4])
> t5: The controller notices the ISR addition and makes 4 the leader, since 1 is to be removed and 4 is the new preferred leader
> t6: Replica 4 stops its fetchers and becomes leader
> t7: We begin seeing the out-of-range errors as the other replicas begin fetching from 4
> We know from analysis of a heap dump from broker 4 that the high watermark was still set to 0 some time after it had become leader. We also know that broker 1 was under significant load. The time between events t4 and t6 was less than 10ms. We don't know when the fetch response sent at t3 returned to broker 4, but we speculate that it arrived after t6 due to the heavy load on the leader, which would explain why broker 4 still had an uninitialized high watermark.
> A more mundane possibility is that there is a bug in the fetch session logic and the partition was simply not included in the fetch response. However, the code appears to anticipate this case: when a partition has an error, we set the cached high watermark to -1 to ensure that it gets updated as soon as the error clears.
> Regardless of how we got there, the fix should be straightforward: when a broker becomes leader, it should ensure its high watermark is at least as large as the log start offset.
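> To make the failure mode concrete, here is a minimal Scala sketch of the offset range check. The names (LogState, segmentPositionOf) are illustrative assumptions, not the actual kafka.log.Log code:
> {code}
> import org.apache.kafka.common.errors.OffsetOutOfRangeException
>
> // Toy model of the relevant log state (illustrative, not the real Log class).
> case class LogState(logStartOffset: Long, logEndOffset: Long, highWatermark: Long)
>
> // Looking up the position of an offset outside [logStartOffset, logEndOffset]
> // raises the same error as in the report.
> def segmentPositionOf(state: LogState, offset: Long): Long = {
>   if (offset < state.logStartOffset || offset > state.logEndOffset)
>     throw new OffsetOutOfRangeException(
>       s"Received request for offset $offset, but we only have log segments " +
>         s"in the range ${state.logStartOffset} to ${state.logEndOffset}.")
>   offset // the real code resolves a segment and byte position here
> }
>
> // Broker 4 after t6: logStartOffset=20, logEndOffset=20, but hw still 0 from t0.
> val broker4 = LogState(logStartOffset = 20, logEndOffset = 20, highWatermark = 0)
> // Serving a follower fetch bounded by the uninitialized high watermark fails:
> segmentPositionOf(broker4, broker4.highWatermark) // throws OffsetOutOfRangeException
> {code}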
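> And a hedged sketch of the proposed fix, clamping the high watermark when a replica becomes leader. Again, ReplicaState and onBecomeLeader are illustrative names, not the actual Partition code:
> {code}
> // Sketch of the fix: on becoming leader, bound the high watermark from below
> // by the log start offset (names are illustrative, not the real Partition code).
> case class ReplicaState(logStartOffset: Long, highWatermark: Long)
>
> def onBecomeLeader(state: ReplicaState): ReplicaState =
>   state.copy(highWatermark = math.max(state.highWatermark, state.logStartOffset))
>
> // With broker 4's state at t6 (logStartOffset=20, hw=0), the high watermark
> // becomes 20, so follower fetches at offset 20 resolve within the segment range.
> assert(onBecomeLeader(ReplicaState(logStartOffset = 20, highWatermark = 0)).highWatermark == 20)
> {code}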