[jira] [Commented] (KAFKA-1097) Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper

Neha Narkhede (JIRA) Tue, 22 Oct 2013 14:58:53 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802324#comment-13802324
 ]


Neha Narkhede commented on KAFKA-1097:
--------------------------------------

[~guozhang] The ISR shrink logic marks replicas as possibly stuck only if the 
replicas are lagging behind the leader's highwatermark. You can argue that the 
leader can keep track of fully caught up replicas that have stopped fetching 
and kick the replicas out of ISR based on the replica's long poll timeout, but 
that is pretty complicated. So if the partition is not getting any more data 
and the old replica is fully caught up with the leader, there is nothing much 
the leader can do to shrink the ISR.

> Race condition while reassigning low throughput partition leads to incorrect 
> ISR information in zookeeper 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1097
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>             Fix For: 0.8
>
>
> While moving partitions, the controller moves the old replicas through the 
> following state changes -
> ONLINE -> OFFLINE -> NON_EXISTENT
> During the offline state change, the controller removes the old replica and 
> writes the updated ISR to zookeeper and notifies the leader. Note that it 
> doesn't notify the old replicas to stop fetching from the leader (to be fixed 
> in KAFKA-1032). During the non-existent state change, the controller does not 
> write the updated ISR or replica list to zookeeper. Right after the 
> non-existent state change, the controller writes the new replica list to 
> zookeeper, but does not update the ISR. So an old replica can send a fetch 
> request after the offline state change, essentially letting the leader add it 
> back to the ISR. The problem is that if there is no new data coming in for 
> the partition and the old replica is fully caught up, the leader cannot 
> remove it from the ISR. That lets a non existent replica live in the ISR at 
> least until new data comes in to the partition



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (KAFKA-1097) Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper

Reply via email to