Neha Narkhede created KAFKA-1097:
------------------------------------

             Summary: Race condition while reassigning partition leads to 
incorrect ISR information in zookeeper 
                 Key: KAFKA-1097
                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
             Project: Kafka
          Issue Type: Bug
          Components: controller
    Affects Versions: 0.8
            Reporter: Neha Narkhede
            Assignee: Neha Narkhede
            Priority: Critical


While moving partitions, the controller moves the old replicas through the 
following state changes:

ONLINE -> OFFLINE -> NON_EXISTENT

During the offline state change, the controller removes the old replica from 
the ISR, writes the updated ISR to ZooKeeper, and notifies the leader. Note 
that it does not notify the old replicas to stop fetching from the leader (to 
be fixed in KAFKA-1032). During the non-existent state change, the controller 
does not write the updated ISR or replica list to ZooKeeper. Right after the 
non-existent state change, the controller writes the new replica list to 
ZooKeeper but does not update the ISR. An old replica can therefore send a 
fetch request after the offline state change, which lets the leader add it 
back to the ISR. As a result, a non-existent replica remains in the ISR.
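
The interleaving described above can be sketched as a small, single-threaded 
simulation. This is only an illustration of the ordering problem, not Kafka's 
controller code: every name here (PartitionZkState, Controller, 
leader_handle_fetch, simulate_race) is hypothetical, ZooKeeper is modelled as 
a plain in-memory object, and the leader is assumed to add any fetching 
follower straight back to the ISR.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReplicaState(Enum):
    ONLINE = "ONLINE"
    OFFLINE = "OFFLINE"
    NON_EXISTENT = "NON_EXISTENT"


@dataclass
class PartitionZkState:
    """Hypothetical stand-in for the partition metadata stored in ZooKeeper."""
    replicas: set
    isr: set


@dataclass
class Controller:
    """Toy controller that only models the writes described in the report."""
    zk: PartitionZkState
    replica_states: dict = field(default_factory=dict)

    def to_offline(self, broker):
        # Offline state change: the updated ISR is written to ZooKeeper,
        # but the old replica is not told to stop fetching.
        self.zk.isr.discard(broker)
        self.replica_states[broker] = ReplicaState.OFFLINE

    def to_non_existent(self, broker):
        # Non-existent state change: nothing is written to ZooKeeper.
        self.replica_states[broker] = ReplicaState.NON_EXISTENT

    def write_new_replica_list(self, new_replicas):
        # Only the replica list is rewritten; the ISR is left untouched.
        self.zk.replicas = set(new_replicas)


def leader_handle_fetch(zk, broker):
    # Simplifying assumption: a fetching follower is caught up, so the
    # leader adds it to the ISR.
    zk.isr.add(broker)


def stale_isr_members(zk):
    # ISR entries that are no longer in the assigned replica list.
    return zk.isr - zk.replicas


def simulate_race():
    # Reassign the partition from brokers {1, 2, 3} to {1, 2}; broker 3 is old.
    zk = PartitionZkState(replicas={1, 2, 3}, isr={1, 2, 3})
    controller = Controller(zk)

    controller.to_offline(3)               # ISR in ZooKeeper becomes {1, 2}
    leader_handle_fetch(zk, 3)             # old replica still fetches: ISR {1, 2, 3}
    controller.to_non_existent(3)          # no ZooKeeper write
    controller.write_new_replica_list({1, 2})  # replicas {1, 2}, ISR unchanged
    return zk


if __name__ == "__main__":
    final = simulate_race()
    print("stale ISR members:", sorted(stale_isr_members(final)))
```

In this sketch broker 3 ends up in the ISR but not in the replica list, which 
is the inconsistent state the report describes. In the same toy model, the 
stale entry would not appear if the old replica stopped fetching before the 
offline state change, or if the ISR were rewritten together with the new 
replica list.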



--
This message was sent by Atlassian JIRA
(v6.1#6144)
