This would be great to add to the operational section of the Kafka documentation.

On Feb 5, 2014, at 2:18 PM, Andrew Otto <o...@wikimedia.org> wrote:

>> - Increasing num.replica.fetchers (defaults is one)
> Awesome! I just tried this one, bumped it up to 8 (12 cores on this broker
> box). It is now catching up at around 17K msgs/sec, which will mean it will
> finish in about 4 or 5 hours. I’ll check up on it again tomorrow.
>
> That should do it, Thanks!
>
>
> On Feb 5, 2014, at 5:04 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
>>
>>> topics are all caught up, but I have one high volume topic (around
>>> 40K msgs/sec) that is taking much longer. I just took a few samples
>>> of Replica-MaxLag to see how long it would take to catch up.
>>> Currently, it is behind about 12.5 million messages and is catching
>>> up at a rate of about 1600 msgs/sec. At that rate, it’ll take
>>> around 9 days before the replica is caught up to the leader.
>>>
>>> Is there any way to speed this up?
>>
>> During the period your high-volume topic is under-replicated you can
>> temporarily try one or both of the following:
>> - Increasing num.replica.fetchers (defaults is one)
>> - If you don't have too many topic-partitions you can also increase
>> replica.fetch.max.bytes.
>>
>>> Or, alternatively, I don’t actually care about this topic’s
>>> history. It is a new topic, and I know that it doesn't yet have any
>>> consumers. I’d be fine with instructing both brokers to drop
>>> old logs and just start from the top of the log. I could do this by
>>> manually deleting the topic (kafka data files and in zookeeper), but
>>> to do so properly with 0.8.0 I think I’d have to shut down the
>>> whole cluster, correct? I’d rather not do this, as another
>>> topic does have a consumer and I don’t want to lose messages for
>>> it.
>>
>> Right - or you could do a rolling bounce and change the retention
>> settings (http://kafka.apache.org/documentation.html#brokerconfigs) of
>> that topic to something low so it gets expired and then do another
>> rolling bounce to remove the override.
>>
>> --
>> Joel
>
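
For anyone sampling Replica-MaxLag by hand as described in the quoted thread, here is a minimal Python sketch of the catch-up estimate. The function name, the (time, lag) sample format, and the numbers in the usage note are illustrative assumptions, not taken from the thread or from any Kafka tool. It only does the arithmetic on lag readings you collect yourself.

```python
from typing import List, Tuple


def estimate_catch_up_seconds(samples: List[Tuple[float, int]]) -> float:
    """Estimate seconds until a replica's lag reaches zero.

    samples: (time_in_seconds, lag_in_messages) pairs, oldest first,
    e.g. Replica-MaxLag readings noted down by hand (hypothetical format).
    Returns float("inf") if the lag is flat or growing.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    (t_first, lag_first), (t_last, lag_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")

    # Net catch-up rate: how fast the lag shrinks. This already accounts
    # for new messages still arriving at the leader during the interval.
    net_rate = (lag_first - lag_last) / elapsed
    if net_rate <= 0:
        return float("inf")

    return lag_last / net_rate
```

With two made-up readings one minute apart, estimate_catch_up_seconds([(0.0, 1_000_000), (60.0, 940_000)]) gives 940.0 seconds, since the lag is shrinking by 1000 messages per second. Using only the first and last sample keeps the sketch simple. A fit over all the samples would be less sensitive to a single noisy reading.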