Sure: https://issues.apache.org/jira/browse/KAFKA-1918
Thanks!
Omid

On Tue, Feb 3, 2015 at 5:32 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Omid,
>
> That is an interesting question. This paragraph was written some time ago
> and we have not tested ZK failure / resume since, so it is hard to tell if
> these cases still exist or not.
>
> One thing we can do is to add different ZK quorum failure scenarios to the
> system test to have it covered over time. Could you file a JIRA?
>
> Guozhang
>
> On Tue, Feb 3, 2015 at 4:23 AM, Omid Aladini <omidalad...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Reading the official FAQ, I bumped into this paragraph:
> >
> > > Once the Zookeeper quorum is down, brokers could result in a bad state
> > > and could not normally serve client requests, etc. Although when
> > > Zookeeper quorum recovers, the Kafka brokers should be able to resume
> > > to normal state automatically, there are still a few corner cases the
> > > they cannot and a hard kill-and-recovery is required to bring it back
> > > to normal. Hence it is recommended to closely monitor your zookeeper
> > > cluster and provision it so that it is performant.
> >
> > What are the corner cases exactly? Any JIRA tickets to explore? How do the
> > corner cases relate to the ZooKeeper cluster being "performant" and
> > "closely monitored"? I'm specifically interested in the inevitable scenario
> > that the ZK leader exits / dies and the quorum goes down momentarily (due
> > to hardware failure, rolling restart, etc).
> >
> > Thanks,
> > Omid
>
> --
> -- Guozhang