I will try to reproduce it. It was sporadic. My setup was a topic with 1 partition and a replication factor of 3. If I kill the console producer and then shut down the leader broker, a new leader is elected. If I then kill the new leader as well, I don't see the last remaining broker being elected as leader. When I then tried starting the console producer, I started seeing errors.
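To make the scenario concrete, here is a toy model of what I expected to happen. It is not Kafka code: the broker ids, the helper names `elect_leader` and `shut_down`, and the "first surviving in-sync replica wins" rule are my own assumptions, and the real controller logic may differ.

```python
# Toy model of the repro: 1 partition, replication factor 3.
# Assumption: the new leader is the first in-sync replica that is still alive.

def elect_leader(isr, alive):
    """Return the first in-sync replica that is alive, or None if there is none."""
    return next((broker for broker in isr if broker in alive), None)

def shut_down(broker, isr, alive):
    """Mark a broker as dead, drop it from the ISR, and elect a new leader."""
    alive.discard(broker)
    if broker in isr:
        isr.remove(broker)
    return elect_leader(isr, alive)

alive = {1, 2, 3}   # hypothetical broker ids
isr = [1, 2, 3]     # in-sync replicas for the single partition

first = elect_leader(isr, alive)        # initial leader
second = shut_down(first, isr, alive)   # kill the leader once
third = shut_down(second, isr, alive)   # kill the new leader too
print(first, second, third)
```

In this model the last broker does take over after the second shutdown, which is what I expected but did not see.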
On Tue, Jul 9, 2013 at 6:14 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Not really - if you shut down a leader broker (and assuming your
> replication factor is > 1) then the other assigned replica will be
> elected as the new leader. The producer would then look up metadata,
> find the new leader and send requests to it. What do you see in the
> logs?
>
> Joel
>
> On Tue, Jul 9, 2013 at 1:44 PM, Calvin Lei <ckp...@gmail.com> wrote:
> > Thanks, you gave me enough pointers to dig deeper. And I tested the fault
> > tolerance by shutting down brokers randomly.
> >
> > What I noticed is if I shut down brokers while my producer and consumer are
> > still running, they recover fine. However, if I shut down a lead broker
> > without a running producer, I can't seem to start the producer afterwards
> > without restarting the previous lead broker. Is this expected?
> > On Jul 9, 2013 10:28 AM, "Joel Koshy" <jjkosh...@gmail.com> wrote:
> >
> >> For 1 I forgot to add - there is an admin tool to reassign replicas but it
> >> would take longer than leader failover.
> >>
> >> Joel
> >>
> >> On Tuesday, July 9, 2013, Joel Koshy wrote:
> >>
> >> > 1 - no, unless broker4 is not the preferred leader. (The preferred
> >> > leader is the first broker in the assigned replica list). If a
> >> > non-preferred replica is the current leader you can run the
> >> > PreferredReplicaLeaderElection admin command to move the leader.
> >> > 2 - The latency of the actual leader movement (on leader failover) is
> >> > fairly low - probably of the order of tens of ms. However, clients
> >> > (producers, consumers) may take longer to detect that (a client needs
> >> > to get back an error response, handle an exception, issue a metadata
> >> > request, get the response to find the new leader, and all that can add
> >> > up but it should not be terribly high - I'm guessing on the order of a
> >> > few hundred ms to a second or so).
> >> > 3 - That should work, although the admin command for adding more
> >> > partitions to a topic is currently being developed.
> >> >
> >> >
> >> > On Mon, Jul 8, 2013 at 11:02 PM, Calvin Lei <ckp...@gmail.com> wrote:
> >> > > Hi,
> >> > > I have three questions regarding the kafka broker setup.
> >> > >
> >> > > 1. Assuming I have a 4-broker and 2-zookeeper (running in quorum mode)
> >> > > setup, if topicA-partition0 has the leader set to broker4, can I change
> >> > > the leader to another broker without killing the current leader?
> >> > >
> >> > > 2. What is the latency of switching to a different leader when the
> >> > > current leader is down? Do we configure it using the consumer property -
> >> > > refresh.leader.backoff.ms?
> >> > >
> >> > > 3. What is the best practice for dynamically adding a new node to a
> >> > > kafka cluster? Should I bring up the node, and then increase the
> >> > > replication factor for the existing topic(s)?
> >> > >
> >> > >
> >> > > thanks in advance,
> >> > > Cal
> >> >
> >> >
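
For reference, the preferred-leader rule and the client-side failover steps described in the quoted replies can be sketched roughly as below. This is an illustrative Python model, not Kafka client code: `preferred_leader`, `send_with_failover`, `NotLeaderError`, and the `cluster` dict are hypothetical names made up for the example, and the "metadata request" is just a dictionary lookup.

```python
class NotLeaderError(Exception):
    """Stand-in for the error response a client gets from a non-leader broker."""

def preferred_leader(assigned_replicas):
    # Per the thread: the preferred leader is the first broker
    # in the assigned replica list.
    return assigned_replicas[0]

def send_with_failover(message, cluster, cached_leader, max_retries=3):
    """Send to the cached leader; on error, refresh metadata and retry."""
    leader = cached_leader
    for _ in range(max_retries + 1):
        try:
            # The (simulated) broker rejects the request if it is not the leader.
            if cluster["leader"] is None or leader != cluster["leader"]:
                raise NotLeaderError(leader)
            cluster["log"].append(message)
            return leader
        except NotLeaderError:
            # Handle the exception, issue a metadata request,
            # and pick up the new leader from the response.
            leader = cluster["leader"]
    raise RuntimeError("no leader available after retries")

# Hypothetical state: broker 1 was the leader, broker 2 took over on failover.
cluster = {"leader": 2, "log": []}
used = send_with_failover("hello", cluster, cached_leader=1)
print(used, cluster["log"])
```

Each extra round trip in the retry loop is where the client-side delay mentioned in point 2 would come from; the model does not try to reproduce actual timings.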