[ https://issues.apache.org/jira/browse/KAFKA-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802335#comment-13802335 ]
Joel Koshy commented on KAFKA-1097:
-----------------------------------

[~guozhang] Neha's comment wrt not getting any more data and the leader not shrinking the ISR applies when the old replica is already fully caught up. The time-based shrinking happens only if the replica was at a smaller log-end-offset. WRT this issue, given that the old replica re-enters the ISR at the end of the new ISR, the likelihood of this being an issue (i.e., the old replica being elected as a leader) is relatively low (can you confirm?). That said, my preference would be the long-term fix on trunk instead of a partially correct fix on 0.8. However, the fact that we have a spurious/incorrect entry in the ISR list would skew the under-replicated partition count, wouldn't it? In which case this would be a blocker issue.

> Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1097
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>             Fix For: 0.8
>
>
> While moving partitions, the controller moves the old replicas through the following state changes -
> ONLINE -> OFFLINE -> NON_EXISTENT
> During the offline state change, the controller removes the old replica and writes the updated ISR to zookeeper and notifies the leader. Note that it doesn't notify the old replicas to stop fetching from the leader (to be fixed in KAFKA-1032). During the non-existent state change, the controller does not write the updated ISR or replica list to zookeeper. Right after the non-existent state change, the controller writes the new replica list to zookeeper, but does not update the ISR.
> So an old replica can send a fetch request after the offline state change, essentially letting the leader add it back to the ISR. The problem is that if there is no new data coming in for the partition and the old replica is fully caught up, the leader cannot remove it from the ISR. That lets a non-existent replica live in the ISR at least until new data comes in to the partition.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
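The leader-side behavior described above can be sketched as a toy model. This is an illustrative simulation only, not the actual Kafka 0.8 code: all names here (`Leader`, `handle_fetch`, `maybe_shrink_isr`, `max_lag_ms`) are hypothetical. It shows why time-based shrinking never evicts a fully caught-up stale replica when no new data arrives:

```python
class Leader:
    """Toy model of the leader-side ISR maintenance sketched in this ticket:
    time-based removal applies only while a follower's log-end-offset (LEO)
    trails the leader's. Names and structure are hypothetical."""

    def __init__(self, leo, isr, max_lag_ms=10_000):
        self.leo = leo                              # leader log-end-offset
        self.isr = set(isr)
        self.max_lag_ms = max_lag_ms
        self.replica_leo = {r: leo for r in isr}    # follower LEOs seen via fetches
        self.lag_start_ms = {}                      # when a follower first fell behind

    def handle_fetch(self, replica, follower_leo, now_ms):
        # A fetch from any replica -- even one reassigned away -- re-adds it
        # to the ISR once it is caught up. This is the race in this ticket.
        self.replica_leo[replica] = follower_leo
        if follower_leo >= self.leo:
            self.isr.add(replica)
            self.lag_start_ms.pop(replica, None)
        else:
            self.lag_start_ms.setdefault(replica, now_ms)

    def maybe_shrink_isr(self, now_ms):
        # Time-based shrinking considers only replicas at a smaller LEO;
        # a fully caught-up replica is never removed here.
        for r in list(self.isr):
            behind = self.replica_leo.get(r, 0) < self.leo
            lagged_ms = now_ms - self.lag_start_ms.get(r, now_ms)
            if behind and lagged_ms > self.max_lag_ms:
                self.isr.remove(r)

leader = Leader(leo=100, isr={"b1", "b2"})
# Old replica b3 was moved off the partition but still fetches (KAFKA-1032)
# and is fully caught up, so the leader puts it back into the ISR:
leader.handle_fetch("b3", follower_leo=100, now_ms=0)
# With no new data for the partition, shrinking never evicts it, even much later:
leader.maybe_shrink_isr(now_ms=3_600_000)
print(sorted(leader.isr))   # → ['b1', 'b2', 'b3']
```

Under this model the stale entry also inflates the apparent replica count for the partition, which is the basis of the under-replicated-partition-count concern in the comment.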