The broker that is still in the ISR recorded in ZK has all committed data.

Thanks,

Jun


On Thu, Jun 27, 2013 at 5:04 PM, Vadim Keylis <vkeylis2...@gmail.com> wrote:

> Jun,
> Does Kafka provide the ability to configure a broker to be in sync before
> it becomes available?
> Is it possible, in case all brokers crash, to find out which node has the
> most recent data in order to initiate a proper startup procedure?
>
> Thanks,
> Vadim
>
>
> On Fri, Jun 21, 2013 at 8:24 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Hi, Bob,
> >
> > Thanks for reporting this. Yes, this is the current behavior when all
> > brokers fail. Whichever broker comes back first becomes the new leader
> > and is the source of truth. This increases availability. However,
> > previously committed data can be lost. This is what we call an unclean
> > leader election. Another option is to wait for a broker in the in-sync
> > replica set to come back before electing a new leader. This will
> > preserve all committed data at the expense of availability. The
> > application can configure the system with the appropriate option based
> > on its needs.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Fri, Jun 21, 2013 at 4:08 PM, Bob Jervis
> > <bjer...@visibletechnologies.com> wrote:
> >
> > > I wanted to send this out because we saw this in some testing we were
> > > doing and wanted to advise the community of something to watch for in
> > > 0.8 HA support.
> > >
> > > We have a two-machine cluster with replication factor 2. We took one
> > > machine offline and re-formatted the disk. We re-installed the Kafka
> > > software, but did not recreate any of the local disk files. The
> > > intention was to simply re-start the broker process, but due to an
> > > error in the network config that took some time to diagnose, we ended
> > > up with both machines' brokers down.
> > >
> > > When we fixed the network config and restarted the brokers, we
> > > happened to start the broker on the rebuilt machine first. The net
> > > result was that when the healthy broker came back online, the rebuilt
> > > machine was already the leader and, because of the Zookeeper state, it
> > > forced the healthy broker to delete all of its topic data, thus wiping
> > > out the entire contents of the cluster.
> > >
> > > We are instituting operations procedures to safeguard against this
> > > scenario in the future (and fortunately we only blew away a test
> > > cluster), but this was a bit of a nasty surprise for a Friday.
> > >
> > > Bob Jervis
> > > Visibletechnologies
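
To make the trade-off discussed in this thread concrete, here is a minimal Python sketch of the two election policies Jun describes. It is a toy model, not Kafka's controller code: the function name `elect_leader`, the `allow_unclean` switch, and the broker ids are all hypothetical, and the ISR contents used for Bob's scenario (only the healthy broker left in the ISR) are an assumption inferred from his description.

```python
from typing import Optional, Sequence, Set


def elect_leader(
    isr: Set[int],
    live_brokers: Sequence[int],
    allow_unclean: bool,
) -> Optional[int]:
    """Pick a leader for one partition after a full outage (toy model).

    isr           -- broker ids in the last in-sync replica set recorded in ZK
    live_brokers  -- broker ids that have come back, in restart order
    allow_unclean -- hypothetical switch between the two policies
    """
    # Prefer a live broker that was in the ISR: it holds all committed data.
    for broker in live_brokers:
        if broker in isr:
            return broker
    # No ISR member is back yet.
    if allow_unclean and live_brokers:
        # Favor availability: the first broker back becomes leader,
        # and previously committed data may be lost.
        return live_brokers[0]
    # Favor durability: the partition stays offline until an ISR member returns.
    return None


# Assumed version of Bob's scenario: broker 1 is healthy and in the ISR,
# broker 2 was rebuilt with an empty disk and happens to start first.
isr = {1}
unclean_choice = elect_leader(isr, [2], allow_unclean=True)
clean_choice = elect_leader(isr, [2], allow_unclean=False)
after_healthy_returns = elect_leader(isr, [2, 1], allow_unclean=False)
```

Under these assumptions, the unclean policy elects the rebuilt broker 2 immediately, which matches the data loss Bob reports. The wait-for-ISR policy leaves the partition without a leader until broker 1 returns, and then elects broker 1.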