[ https://issues.apache.org/jira/browse/KAFKA-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Lin resolved KAFKA-6636.
-----------------------------
    Resolution: Fixed

This is no longer an issue after KAFKA-3978; Ensure high watermark is always positive.

> ReplicaFetcherThread should not die if hw < 0
> ---------------------------------------------
>
>                 Key: KAFKA-6636
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6636
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>
> ReplicaFetcherThread can die in the following scenario:
>
> 1) Partition P1 has replica set size 1. Broker A is the leader. The segment
> is empty and the log start offset is 100.
> 2) User executes partition reassignment to change the replica set from {A} to
> {B, C}.
> 3) Broker B starts ReplicaFetcherThread, which triggers
> handleOffsetOutOfRange(), truncates the log fully and starts at offset 100. At
> this moment its high watermark is still 0 (or -1). Same for broker C.
> 4) Broker B sends a FetchRequest to A at offset 100, broker A immediately adds
> broker B to the ISR set, and the controller moves leadership to broker B.
> 5) Broker B handles the LeaderAndIsrRequest to become leader. It calls
> `leaderReplica.convertHWToLocalOffsetMetadata()` to initialize its HW. Since
> its HW was smaller than logStartOffset=100, its HW is now overridden to
> LogOffsetMetadata.UnknownOffsetMetadata, i.e. -1.
> 6) Broker C handles the LeaderAndIsrRequest to fetch from broker B. Broker C
> updates its HW to the HW in broker B's FetchResponse, i.e. -1. Then broker C
> calls replica.maybeIncrementLogStartOffset(leaderLogStartOffset) where
> leaderLogStartOffset=100. This causes an exception because
> leaderLogStartOffset > HW. The exception is unhandled, so the
> ReplicaFetcherThread exits.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
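
For readers who want to trace steps 5 and 6 of the quoted scenario, here is a minimal toy model in Python. It is only a sketch under stated assumptions: ToyReplica, follower_fetch_once and run_scenario are hypothetical names invented for illustration, the method names merely mirror the Kafka identifiers mentioned in the report, and ValueError stands in for whatever exception the broker actually raises. It is not Kafka's implementation and it does not show the KAFKA-3978 fix.

```python
UNKNOWN_OFFSET = -1  # stand-in for LogOffsetMetadata.UnknownOffsetMetadata


class ToyReplica:
    """Hypothetical, heavily simplified replica (not Kafka's Replica class)."""

    def __init__(self, log_start_offset=0, high_watermark=0):
        self.log_start_offset = log_start_offset
        self.high_watermark = high_watermark

    def convert_hw_to_local_offset_metadata(self):
        # Step 5: an HW below the log start offset has no local position,
        # so this model overrides it to the "unknown" value (-1).
        if self.high_watermark < self.log_start_offset:
            self.high_watermark = UNKNOWN_OFFSET

    def maybe_increment_log_start_offset(self, new_log_start_offset):
        # Step 6: the log start offset must not move past the HW.
        if new_log_start_offset > self.high_watermark:
            raise ValueError(
                f"cannot move log start offset to {new_log_start_offset}: "
                f"it is larger than the high watermark {self.high_watermark}"
            )
        if new_log_start_offset > self.log_start_offset:
            self.log_start_offset = new_log_start_offset


def follower_fetch_once(follower, leader):
    # The follower adopts the leader's HW, then its log start offset.
    follower.high_watermark = leader.high_watermark
    follower.maybe_increment_log_start_offset(leader.log_start_offset)


def run_scenario():
    # State after step 3: both logs start at offset 100 with HW still 0.
    broker_b = ToyReplica(log_start_offset=100, high_watermark=0)
    broker_c = ToyReplica(log_start_offset=100, high_watermark=0)

    # Step 5: broker B becomes leader and its HW is overridden to -1.
    broker_b.convert_hw_to_local_offset_metadata()

    # Step 6: broker C fetches from B; nothing in the fetch path catches
    # the error in the report, so here it is caught only to display it.
    try:
        follower_fetch_once(broker_c, broker_b)
    except ValueError as exc:
        return broker_b.high_watermark, str(exc)
    return broker_b.high_watermark, None
```

Calling run_scenario() in this model returns broker B's overridden HW together with the error message raised on broker C, which is the condition (leaderLogStartOffset > HW) that the report says kills the fetcher thread.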