[ https://issues.apache.org/jira/browse/KAFKA-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213173#comment-14213173 ]
Guozhang Wang commented on KAFKA-1767:
--------------------------------------

Did you have a controller migration before this happened? If so, you are very likely hitting KAFKA-1578.

> /admin/reassign_partitions deleted before reassignment completes
> ----------------------------------------------------------------
>
>                 Key: KAFKA-1767
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1767
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Ryan Berdeen
>            Assignee: Neha Narkhede
>
> https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/controller/KafkaController.scala#L477-L517
> describes the process of reassigning partitions. Specifically, by the time
> {{/admin/reassign_partitions}} is updated, the newly assigned replicas (RAR)
> should be in sync, and the assigned replicas (AR) in ZooKeeper should be
> updated:
> {code}
> 4. Wait until all replicas in RAR are in sync with the leader.
> ...
> 10. Update AR in ZK with RAR.
> 11. Update the /admin/reassign_partitions path in ZK to remove this partition.
> {code}
> This worked in 0.8.1, but in 0.8.1.1 we observe
> {{/admin/reassign_partitions}} being removed before step 4 has completed.
> For example, if we have AR [1,2] and then put [3,4] in
> {{/admin/reassign_partitions}}, the cluster will end up with AR [1,2,3,4] and
> ISR [1,2] when the key is removed. Eventually, the AR will be updated to
> [3,4].
> This means that the {{kafka-reassign-partitions.sh}} tool will accept a new
> batch of reassignments before the current reassignments have finished, and
> our own tool that feeds in reassignments in small batches (see KAFKA-1677)
> can't rely on this key to detect active reassignments.
> Although we haven't observed this, it seems likely that if a controller
> resignation happens, the new controller won't know that a reassignment is in
> progress, and the AR will never be updated to the RAR.
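
Since the {{/admin/reassign_partitions}} key cannot be relied on in 0.8.1.1, a batch-feeding tool could instead check per-partition state against steps 4 and 10 directly. The following is only a minimal sketch of that check in Python, not anything Kafka provides: the function name and its parameters are hypothetical, and it assumes the caller has already read the AR and ISR for the partition from ZooKeeper (not shown) and still remembers the RAR it submitted.

```python
def reassignment_in_progress(assigned, in_sync, target):
    """Return True while a partition has not fully moved to `target` (the RAR).

    assigned: current assigned replicas (AR) as read from ZooKeeper
    in_sync:  current in-sync replicas (ISR) as read from ZooKeeper
    target:   the replica list that was submitted for reassignment (RAR)
    All three are hypothetical inputs supplied by the caller.
    """
    # Step 4: every replica in the RAR must be in sync with the leader.
    caught_up = set(target) <= set(in_sync)
    # Step 10: the AR stored in ZooKeeper must have been replaced by the RAR.
    ar_updated = list(assigned) == list(target)
    return not (caught_up and ar_updated)


# The intermediate state from the report: AR [1,2,3,4], ISR [1,2], RAR [3,4].
still_moving = reassignment_in_progress([1, 2, 3, 4], [1, 2], [3, 4])
# The final state from the report: AR [3,4], with the new replicas in sync.
finished = not reassignment_in_progress([3, 4], [3, 4], [3, 4])
```

With a check like this, the intermediate state described in the report counts as still in progress even though the ZooKeeper key has already been removed.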