[ https://issues.apache.org/jira/browse/KAFKA-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802335#comment-13802335 ]
Joel Koshy commented on KAFKA-1097:
-----------------------------------

[~guozhang] Neha's comment wrt not getting any more data and the leader not shrinking the ISR applies when the old replica is already fully caught up. The time-based shrinking happens only if the replica was at a smaller log-end-offset. WRT this issue, given that the old replica re-enters the ISR at the end of the new ISR, the likelihood of this being an issue (i.e., the old replica being elected as a leader) is relatively low (can you confirm?). That said, my preference would be the long-term fix on trunk instead of a partially correct fix on 0.8. However, the fact that we have a spurious/incorrect entry in the ISR list would skew the under-replicated partition count, wouldn't it? In which case this would be a blocker issue.

> Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1097
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>             Fix For: 0.8
>
>
> While moving partitions, the controller moves the old replicas through the following state changes -
> ONLINE -> OFFLINE -> NON_EXISTENT
> During the offline state change, the controller removes the old replica and writes the updated ISR to zookeeper and notifies the leader. Note that it doesn't notify the old replicas to stop fetching from the leader (to be fixed in KAFKA-1032). During the non-existent state change, the controller does not write the updated ISR or replica list to zookeeper. Right after the non-existent state change, the controller writes the new replica list to zookeeper, but does not update the ISR.
> So an old replica can send a fetch request after the offline state change, essentially letting the leader add it back to the ISR. The problem is that if there is no new data coming in for the partition and the old replica is fully caught up, the leader cannot remove it from the ISR. That lets a non-existent replica live in the ISR at least until new data comes in to the partition.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
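The leader-side behavior described above can be sketched as a toy model. This is an illustrative simulation only, not the actual Kafka 0.8 code: all names here (`Leader`, `handle_fetch`, `maybe_shrink_isr`, `max_lag_ms`) are hypothetical. It shows why time-based shrinking never evicts a fully caught-up stale replica when no new data arrives:

```python
class Leader:
    """Toy model of the leader-side ISR maintenance sketched in this ticket:
    time-based removal applies only while a follower's log-end-offset (LEO)
    trails the leader's. Names and structure are hypothetical."""

    def __init__(self, leo, isr, max_lag_ms=10_000):
        self.leo = leo                              # leader log-end-offset
        self.isr = set(isr)
        self.max_lag_ms = max_lag_ms
        self.replica_leo = {r: leo for r in isr}    # follower LEOs seen via fetches
        self.lag_start_ms = {}                      # when a follower first fell behind

    def handle_fetch(self, replica, follower_leo, now_ms):
        # A fetch from any replica -- even one reassigned away -- re-adds it
        # to the ISR once it is caught up. This is the race in this ticket.
        self.replica_leo[replica] = follower_leo
        if follower_leo >= self.leo:
            self.isr.add(replica)
            self.lag_start_ms.pop(replica, None)
        else:
            self.lag_start_ms.setdefault(replica, now_ms)

    def maybe_shrink_isr(self, now_ms):
        # Time-based shrinking considers only replicas at a smaller LEO;
        # a fully caught-up replica is never removed here.
        for r in list(self.isr):
            behind = self.replica_leo.get(r, 0) < self.leo
            lagged_ms = now_ms - self.lag_start_ms.get(r, now_ms)
            if behind and lagged_ms > self.max_lag_ms:
                self.isr.remove(r)

leader = Leader(leo=100, isr={"b1", "b2"})
# Old replica b3 was moved off the partition but still fetches (KAFKA-1032)
# and is fully caught up, so the leader puts it back into the ISR:
leader.handle_fetch("b3", follower_leo=100, now_ms=0)
# With no new data for the partition, shrinking never evicts it, even much later:
leader.maybe_shrink_isr(now_ms=3_600_000)
print(sorted(leader.isr))   # → ['b1', 'b2', 'b3']
```

Under this model the stale entry also inflates the apparent replica count for the partition, which is the basis of the under-replicated-partition-count concern in the comment.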