[jira] [Commented] (KAFKA-7854) Behavior change in controller picking up partition reassignment tasks since 1.1.0

Adem Efe Gencer (JIRA) Tue, 22 Jan 2019 11:23:31 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749043#comment-16749043
 ]


Adem Efe Gencer commented on KAFKA-7854:
----------------------------------------

*Relevant Cruise Control Issue*: 
[Issue-496|https://github.com/linkedin/cruise-control/issues/496].

*Key issue*: Kafka lacks the proper public APIs for [Cruise 
Control|https://github.com/linkedin/cruise-control] (CC) to manage a cluster. 
In particular, Kafka does not support dynamically adding replica reassignments 
while there are ongoing reassignments. Hence, it is not possible to maintain a 
desired level of replica movement concurrency unless CC uses the ZK API.
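
To make the concurrency point concrete, here is a minimal sketch of the kind of "top-up" CC performs against the znode. It assumes the usual /admin/reassign_partitions JSON layout (a "version" field plus a "partitions" list of topic/partition/replicas entries) and that the controller removes entries from the znode as they finish; the function name and the max_concurrent parameter are hypothetical, not part of Kafka or CC. The resulting payload only helps if the controller notices data changes on the znode, which is exactly the behavior discussed in this ticket.

```python
import json


def top_up_reassignments(znode_json, pending, max_concurrent):
    """Merge pending tasks into the current znode payload, up to a cap.

    Hypothetical helper for illustration only. Returns a tuple of
    (new JSON payload for the znode, tasks that are still pending).
    """
    # Entries still listed in the znode are treated as in flight.
    current = json.loads(znode_json)["partitions"] if znode_json else []
    in_flight = {(p["topic"], p["partition"]) for p in current}

    merged = list(current)
    remaining = []
    for task in pending:
        key = (task["topic"], task["partition"])
        if key in in_flight:
            # Never overwrite a movement that is already in progress.
            remaining.append(task)
        elif len(merged) < max_concurrent:
            merged.append(task)
            in_flight.add(key)
        else:
            # Cap reached; retry on the next top-up.
            remaining.append(task)

    payload = json.dumps({"version": 1, "partitions": merged}, sort_keys=True)
    return payload, remaining
```

Batch-by-batch execution corresponds to calling this only when the znode is empty; the incremental approach calls it whenever an in-flight movement completes, so the number of concurrent movements stays close to the cap.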

The conundrum here is that the ZK APIs are considered internal, and AdminClient 
is recommended whenever possible (see [~ijuma]'s comment on [a similar 
use-case|https://github.com/linkedin/cruise-control/issues/285#issuecomment-410455705]).
 However, as of today, the required public APIs do not exist, so silent 
behavior changes such as the one in [PR-4143|https://github.com/apache/kafka/pull/4143] 
break CC and adversely affect its users.

*Recommended solution*: Ideally, Kafka should provide AdminClient APIs to:

# Allow dynamically appending an additional set of replica reassignments 
regardless of whether there are ongoing replica reassignments, and
# Allow clean cancellation of ongoing replica movements (see 
[KAFKA-6304|https://issues.apache.org/jira/browse/KAFKA-6304]).
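
As a rough illustration of the semantics the two items above ask for, here is a toy in-memory model. It is not a Kafka API: the class and method names are hypothetical, and a real implementation would live in the controller behind AdminClient.

```python
class ReassignmentQueue:
    """Hypothetical in-memory model of the two requested operations."""

    def __init__(self):
        # (topic, partition) -> target replica list for in-progress tasks.
        self._active = {}

    def append(self, tasks):
        """Add tasks even while others are in progress.

        Returns the keys that were rejected because a reassignment for
        that partition is already in progress.
        """
        rejected = []
        for key, replicas in tasks.items():
            if key in self._active:
                rejected.append(key)
            else:
                self._active[key] = list(replicas)
        return rejected

    def cancel(self, keys):
        """Cancel in-progress tasks; returns the tasks actually cancelled."""
        return {k: self._active.pop(k) for k in keys if k in self._active}

    def active(self):
        """Return a copy of the in-progress tasks."""
        return dict(self._active)
```

The point of the model is only that appending must not require the active set to be empty, and that cancellation is addressed per partition rather than per batch.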

> Behavior change in controller picking up partition reassignment tasks since 
> 1.1.0
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-7854
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7854
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>            Reporter: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> After [https://github.com/apache/kafka/pull/4143], the controller no longer 
> subscribes to data changes on /admin/reassign_partitions (in order to avoid 
> unnecessarily reloading the reassignment data after the controller updates 
> the znode), unlike previous Kafka versions. However, there are systems built 
> around Kafka that rely on the previous behavior to incrementally update the 
> list of partition reassignments, since Kafka does not natively support that.
>  
> For example, [cruise control|https://github.com/linkedin/cruise-control] can 
> rely on the previous behavior (controller listening to data changes) to 
> maintain the reassignment concurrency by dynamically updating the data in the 
> reassignment znode instead of waiting for the current batch to finish and 
> doing reassignment batch by batch, which can significantly reduce the 
> rebalance time in production clusters. Although directly updating the znode 
> can be viewed as an anti-pattern in the long term, it is necessary because 
> Kafka does not natively support incrementally submitting more reassignment 
> tasks. However, after our Kafka clusters migrated from 0.11 to 2.0, cruise 
> control no longer works because the controller behavior has changed. This 
> reveals the following problems:
>  * These behavior changes may be viewed as internal changes, so compatibility 
> is not guaranteed, but I think by convention people do view this as a public 
> interface and rely on its compatibility. In this case, I think we should 
> clearly document the data contract for the partition reassignment task to 
> avoid misuse and to avoid making controller changes that break the defined 
> data contract. There may be other cases (e.g. topic deletion) whose data 
> contracts need to be clearly defined, and we should keep this in mind when 
> making controller changes.
>  * Kafka does not natively support incrementally submitting more reassignment 
> tasks. If we do want to support that nicely, we should consider changing how 
> we store the reassignment data: store it in child nodes and let the 
> controller listen for child node changes, similar to what we do for 
> /admin/delete_topics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
