[
https://issues.apache.org/jira/browse/KAFKA-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372870#comment-15372870
]
Ralph Weires commented on KAFKA-1464:
-------------------------------------
Another related idea, since the consumer rebalancing issues we run into
during maintenance drove me up the wall yesterday... I'm just desperately
looking for a way to get this stabilized (on our v0.8.2.1) ;)
Wouldn't a (manual and temporary) modification of the partition assignment also
be a viable option, to prevent a given node from becoming leader for any
partitions?
I mean, could I run kafka-reassign-partitions.sh with a customized partition
assignment that wouldn't actually re-assign any partitions to different
brokers, but would merely change the replica *order* for several of the
partitions, such that the node in question is no longer the first replica for
any partition? If I understand it right, the controller will always prefer the
first replica as leader when balancing, so I'd just need to make sure that my
node isn't the first replica for anything. All this temporarily of course,
so after the maintenance I'd restore the original partition assignment.
Should this work, or would you expect specific problems with this workaround...?
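To make the idea concrete, here is a rough sketch of how such a reordered
assignment could be generated. It assumes the JSON layout that
kafka-reassign-partitions.sh reads via --reassignment-json-file ("version"
plus a "partitions" list of topic/partition/replicas entries); the function
name demote_broker, the topic name and the broker ids are made up for
illustration. Whether the controller on 0.8.2.1 accepts an order-only change
is exactly the open question above, so treat this as untested.

```python
import json


def demote_broker(assignment, broker_id):
    """Return a new assignment in which broker_id is not the first replica.

    Only the replica order changes; the set of brokers per partition stays
    the same. Partitions with a single replica are left untouched, since
    there is no other broker to put first.
    """
    result = {"version": assignment.get("version", 1), "partitions": []}
    for entry in assignment["partitions"]:
        replicas = list(entry["replicas"])  # copy, keep the input unmodified
        if len(replicas) > 1 and replicas[0] == broker_id:
            # Move the broker under maintenance to the end of the list.
            replicas = replicas[1:] + [broker_id]
        result["partitions"].append({
            "topic": entry["topic"],
            "partition": entry["partition"],
            "replicas": replicas,
        })
    return result


if __name__ == "__main__":
    # Hypothetical current assignment; broker 3 is the node to be serviced.
    current = {
        "version": 1,
        "partitions": [
            {"topic": "events", "partition": 0, "replicas": [3, 1, 2]},
            {"topic": "events", "partition": 1, "replicas": [1, 3, 2]},
        ],
    }
    print(json.dumps(demote_broker(current, 3), indent=2))
```

The output would be saved to a file and passed to kafka-reassign-partitions.sh
with --reassignment-json-file and --execute, presumably followed by a
preferred replica election so that leadership actually moves. Keeping a copy
of the original JSON makes it easy to restore the old order afterwards.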
Also: Let me know if this rather belongs on the mailing list, since
admittedly it isn't really related to throttling... But as a side remark in
this regard, I also tried throttling outside Kafka (i.e. on the network
side, using wondershaper) in our problem case, but that didn't help. I'd
agree this would need to be done within Kafka, i.e. to be able to separate
out-of-sync replica recovery traffic from the rest.
> Add a throttling option to the Kafka replication tool
> -----------------------------------------------------
>
> Key: KAFKA-1464
> URL: https://issues.apache.org/jira/browse/KAFKA-1464
> Project: Kafka
> Issue Type: New Feature
> Components: replication
> Affects Versions: 0.8.0
> Reporter: mjuarez
> Assignee: Ben Stopford
> Priority: Minor
> Labels: replication, replication-tools
> Fix For: 0.10.1.0
>
>
> When performing replication on new nodes of a Kafka cluster, the replication
> process will use all available resources to replicate as fast as possible.
> This causes performance issues (mostly disk IO and sometimes network
> bandwidth) when doing this in a production environment, in which you're
> trying to serve downstream applications, at the same time you're performing
> maintenance on the Kafka cluster.
> An option to throttle the replication to a specific rate (in either MB/s or
> activities/second) would help production systems to better handle maintenance
> tasks while still serving downstream applications.