[ 
https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737419#comment-13737419
 ] 

Swapnil Ghike commented on KAFKA-990:
-------------------------------------

Comments on v1:

I think a much clearer name for this tool would be ReassignReplicasCommand.

Admin tool:
11. I think the tool should also have a verify/validate option that you can run 
after the replica reassignment has been completed. As of now, the reassignment 
of a certain partition can fail and the admin won't know without looking at the 
controller log.
12. It would be good to make dry-run the default behaviour.
13. topics could be a json array, but I don't have a strong opinion one way or 
the other.
14. we should explicitly check that the options for the manual replica 
assignment and the assignment using brokerList should not be allowed at the 
same time. We should also check that the brokerList option is always provided 
with the topicsToMoveJsonFile option.
15. The "failed" messages could probably go to System.err.
16. We should probably not use the partition Id 0 in calling 
assignReplicasToBrokers, can we instead use { val topicAndPartition = 
groupedByTopic.get(topicInfo._1).get.get(0); val replicationFactor= 
topicInfo._2.get(topicAndPartition).get.size} ?
17. dryrun --> dry-run? I just remember seeing the latter more often across 
other tools.

On the controller:
21. I think a clearer name for initiateReassignPartitionForTopic would be 
initiateReassignReplicasForTopicPartition; and similar renames if any.
22. We should fix the comment in ReassignPartitionsIsrChangeListener
23. We should update controllerContext.partitionReplicaAssignment only once in 
KafkaController.updateAssignedReplicasForPartition(). There is an extra 
over-write as of now.
24. We should batch the requests in 
KafkaController.StopOldReplicasOfReassignedPartition() 
25. We should call startNewReplicasForReassignedPartition directly in 
initiateReassignPartitionForTopic instead of calling onPartitionReassignment. 
As of now, every time areReplicasInIsr returns fals, the controller will call 
startNewReplicasForReassignedPartition and then log a 
StateChangeFailedException, because the replicas were already in the new state. 
This exception will be logged in every call of onPartitionReassignment except 
for the first call.
26. We should remove the false condition from onPartitionReassignment
27. Currently, for each partition that is reassigned, controller deletes the 
/admin/reassign_partitions zk path, and populates it with a new list with the 
reassigned partition removed from the original list. This is probably an 
overkill, and we can delete the zk path completely once the reassignment of all 
partitions has completed successfully or in error. Even if there was a 
controller failover when the reassignment was in progress, the new controller 
should be able to decide which partitions have already been reassigned and 
which have not been in initiateReassignPartitionForTopic.
                
> Fix ReassignPartitionCommand and improve usability
> --------------------------------------------------
>
>                 Key: KAFKA-990
>                 URL: https://issues.apache.org/jira/browse/KAFKA-990
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>         Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch, 
> KAFKA-990-v2.patch
>
>
> 1. The tool does not register for IsrChangeListener on controller failover.
> 2. There is a race condition where the previous listener can fire on 
> controller failover and the replicas can be in ISR. Even after re-registering 
> the ISR listener after failover, it will never be triggered.
> 3. The input the tool is a static list which is very hard to use. To improve 
> this, as a first step the tool needs to take a list of topics and list of 
> brokers to do the assignment to and then generate the reassignment plan.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to