Jiangjie Qin created KAFKA-3003:
-----------------------------------

             Summary: The fetch.wait.max.ms is not honored when new log segment 
rolled for low volume topics.
                 Key: KAFKA-3003
                 URL: https://issues.apache.org/jira/browse/KAFKA-3003
             Project: Kafka
          Issue Type: Bug
            Reporter: Jiangjie Qin
            Assignee: Jiangjie Qin


The problem we saw can be explained by the example below:

1. The message at offset 100 is appended to partition p0, log segment 
00000000.log, at time T. After that, no message is appended. 
2. This message is replicated, and the leader replica updates its 
highWatermark.messageOffset=100, highWatermark.segmentBaseOffset=0.
3. At time T + retention.ms, because no message has been appended to the 
current active log segment for retention.ms, the last modified time of that 
segment reaches the retention time. 
4. The broker rolls a new log segment 00000001.log and deletes the old log 
segment 00000000.log. The new log segment in this case is empty because no 
message has been appended. 
5. In Log, nextOffsetMetadata.segmentBaseOffset is updated to the new log 
segment's base offset, but nextOffsetMetadata.messageOffset does not change, 
so nextOffsetMetadata.messageOffset=1, 
nextOffsetMetadata.segmentBaseOffset=1.
6. Now a FetchRequest arrives and tries to fetch from offset 1, with 
fetch.wait.max.ms=1000.
7. In ReplicaManager, because there is no data to return, the fetch request 
is put into purgatory. When delayedFetchPurgatory.tryCompleteElseWatch() 
is called, DelayedFetch.tryComplete() compares replica.highWatermark with 
the fetchOffset returned by log.read(). It sees 
replica.highWatermark.segmentBaseOffset=0 and fetchOffset.segmentBaseOffset=1, 
so it assumes the fetch occurs on a later segment and completes the delayed 
fetch immediately, without waiting for fetch.wait.max.ms.
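
The check in step 7 can be modelled roughly as below. This is only a simplified 
sketch in Python, not the broker code: OffsetMetadata and 
completes_immediately are hypothetical names, and the message offsets are 
assumed to be equal on both sides because nothing was appended after the roll. 
Only the segment base offsets (0 and 1) come from the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetMetadata:
    """Hypothetical stand-in for LogOffsetMetadata (two fields only)."""
    message_offset: int
    segment_base_offset: int


def completes_immediately(high_watermark: OffsetMetadata,
                          fetch_offset: OffsetMetadata) -> bool:
    """Simplified model of the segment comparison described in step 7.

    If the two positions sit on different segments, the delayed fetch is
    treated as being on another segment and is completed right away.
    """
    return high_watermark.segment_base_offset != fetch_offset.segment_base_offset


# Stale high watermark still points at the deleted segment (base offset 0),
# while the fetch offset is on the newly rolled, empty segment (base offset 1).
stale_hw = OffsetMetadata(message_offset=1, segment_base_offset=0)
fetch_offset = OffsetMetadata(message_offset=1, segment_base_offset=1)

# True means the fetch returns at once instead of waiting in purgatory.
fetch_returns_at_once = completes_immediately(stale_hw, fetch_offset)

# Had the high watermark moved to the new segment, the fetch would wait.
fresh_hw = OffsetMetadata(message_offset=1, segment_base_offset=1)
fetch_waits = not completes_immediately(fresh_hw, fetch_offset)
```

In this model the consumer gets an empty response immediately on every 
request, which matches the symptom in the summary.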

In this case, replica.highWatermark was not updated because 
LogOffsetMetadata.precedes() only checks the messageOffset and ignores the 
segmentBaseOffset. The fix is to let LogOffsetMetadata.precedes() first check 
the messageOffset and then check the segmentBaseOffset, so that 
replica.highWatermark gets updated after the follower fetches from the leader.
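
To illustrate the proposed comparison, here is a minimal sketch in Python. It 
is not the actual Scala code: the class and function names are hypothetical, 
and it assumes that the segmentBaseOffset is used only as a tie-breaker when 
the two messageOffsets are equal, which is one reading of "first check the 
messageOffset then check the segmentBaseOffset".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetMetadata:
    """Hypothetical stand-in for LogOffsetMetadata (two fields only)."""
    message_offset: int
    segment_base_offset: int


def precedes_offset_only(a: OffsetMetadata, b: OffsetMetadata) -> bool:
    """Behaviour described in the report: only messageOffset is compared."""
    return a.message_offset < b.message_offset


def precedes_with_segment(a: OffsetMetadata, b: OffsetMetadata) -> bool:
    """Proposed behaviour (assumed): messageOffset first, then segment."""
    if a.message_offset != b.message_offset:
        return a.message_offset < b.message_offset
    return a.segment_base_offset < b.segment_base_offset


def maybe_advance(current: OffsetMetadata, candidate: OffsetMetadata,
                  precedes) -> OffsetMetadata:
    """Move the high watermark only if `current` precedes `candidate`."""
    return candidate if precedes(current, candidate) else current


# Same message offset, but the candidate is on the newly rolled segment.
old_hw = OffsetMetadata(message_offset=1, segment_base_offset=0)
new_hw = OffsetMetadata(message_offset=1, segment_base_offset=1)

# With the offset-only check the high watermark stays on the old segment.
hw_before_fix = maybe_advance(old_hw, new_hw, precedes_offset_only)

# With the segment-aware check it moves to the new segment.
hw_after_fix = maybe_advance(old_hw, new_hw, precedes_with_segment)
```

With the second comparison the high watermark and the fetch offset end up on 
the same segment, so the check in step 7 no longer completes the delayed 
fetch early.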




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
