[ https://issues.apache.org/jira/browse/KAFKA-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986026#comment-13986026 ]

Joel Koshy commented on KAFKA-1300:
-----------------------------------

Understood, but the primary use case would be to proceed to do a controlled shutdown of the next broker in the shutdown plan. However, with retries and a large enough retry interval that is not needed. (E.g., you can set a very large number of retries.)

The documentation recommends closely monitoring under-replicated-partition counts across the cluster (and alert if it is anything other than zero). i.e., ensuring brokers are in a fully replicated state is a "best-practice" for operations and should be 24/7 (not just during bounces).

> Added WaitForReplaction admin tool.
> -----------------------------------
>
>                 Key: KAFKA-1300
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1300
>             Project: Kafka
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.8.0
>         Environment: Ubuntu 12.04
>            Reporter: Brenden Matthews
>              Labels: patch
>             Fix For: 0.8.1
>
>         Attachments: 0001-Added-WaitForReplaction-admin-tool.patch
>
>
> I have created a tool similar to the broker shutdown tool for doing rolling
> restarts of Kafka clusters.
> The tool watches the max replica lag of the specified broker, and waits until
> the lag drops to 0 before exiting.
> To do a rolling restart, here's the process we use:
> for (broker <- brokers) {
>   run shutdown tool for broker
>   terminate broker
>   start new broker
>   run wait for replication tool on new broker
> }
> Here's an example command line use:
> ./kafka-run-class.sh kafka.admin.WaitForReplication --zookeeper zk.host.com:2181 --num.retries 100 --retry.interval.ms 60000 --broker 0
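
For readers following the thread, the wait step discussed above (poll the broker's max replica lag, retry a bounded number of times with a fixed interval, stop once the lag reaches 0) can be sketched roughly as follows. This is only an illustration, not the code in the attached patch: `get_max_lag` is a hypothetical callable standing in for however the lag is actually read, and the function name and `sleep` parameter are likewise assumptions. The defaults mirror the `--num.retries 100 --retry.interval.ms 60000` values from the example command line.

```python
import time
from typing import Callable


def wait_for_replication(
    get_max_lag: Callable[[], int],           # hypothetical: returns the broker's max replica lag
    num_retries: int = 100,                   # mirrors --num.retries in the example command
    retry_interval_s: float = 60.0,           # mirrors --retry.interval.ms 60000
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the lag is 0, or False if the retries run out first."""
    # One initial check plus up to num_retries further checks.
    for attempt in range(num_retries + 1):
        if get_max_lag() == 0:
            return True
        if attempt < num_retries:
            sleep(retry_interval_s)
    return False


# Illustrative run with a made-up lag sequence and no real sleeping.
fake_lags = iter([120, 40, 0])
caught_up = wait_for_replication(lambda: next(fake_lags),
                                 num_retries=5,
                                 retry_interval_s=0.0,
                                 sleep=lambda _seconds: None)
```

With a large `num_retries` the call blocks until the broker has caught up, which is the behaviour the comment above relies on before moving to the next broker in the shutdown plan.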